Generated Prediction Data of COVID-19's Daily Infections in Brazil

Published: 4 August 2020 | Version 5 | DOI: 10.17632/t2zk3xnt8y.5
Contributor:
Mohamed Hawas

Description

Dataset general description:
• This dataset reports 4200 recurrent neural network models, their settings, and their generated files (prediction csv files, graphs, and metadata files, as applicable) for predicting COVID-19's daily infections in Brazil by training on limited raw data (30 and 40 time-steps). The code was developed by the author and is available in the following online data repository: http://dx.doi.org/10.17632/yp4d95pk7n.3

Dataset content:
• Models, graphs, and csv prediction files:
1. Deterministic mode (DM): includes 1197 generated model files (30 time-steps) with their 2835 generated graphs and 2835 prediction files, and 1976 generated model files (40 time-steps) with their 7301 generated graphs and 7301 prediction files.
2. Non-deterministic mode (NDM): includes 20 generated model files (30 time-steps) with their 53 generated graphs and 53 prediction files.
3. Technical validation mode (TVM): includes 1001 generated model files (30 time-steps), and 3619 generated graphs and 3619 prediction files for 349 models. These 349 models come from a sample of 358 of the 1001 models; 9 models in the sample didn't achieve the accuracy threshold. Also includes all data of the control group, India (1 model).
4. 1 graph and 1 prediction file for each of DM and NDM, reporting evaluation until 2020-07-11.
5. The performance evaluation of the 10, 20, 30, 40, and 50 time-step alternatives (5 models).
• Settings and metadata for the above 3 categories:
1. Settings used during the training session, in json files.
2. Metadata: training / prediction setup and accuracy, in csv files.

Raw data source used to train the models:
• The raw data [1] used for training the models is from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://github.com/CSSEGISandData/COVID-19 (accessed 2020-07-20)
• The following raw data links were used (both accessed 2020-07-08):
1. Until 2020-06-29: https://github.com/CSSEGISandData/COVID-19/raw/78d91b2dbc2a26eb2b2101fa499c6798aa22fca8/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
2. Until 2020-06-13: https://github.com/CSSEGISandData/COVID-19/raw/02ea750a263f6d8b8945fdd3253b35d3fd9b1bee/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

References:
[1] Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Inf Dis. 20(5):533-534. doi: 10.1016/S1473-3099(20)30120-1
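The raw CSSE time series files store cumulative confirmed counts, one row per country or province and one column per date after the first four columns (Province/State, Country/Region, Lat, Long). The minimal sketch below shows one way to turn such a file into a short window of daily infections. It is not the author's code: the function name `daily_infections`, the differencing step, and the toy `SAMPLE` values are assumptions made for illustration, and this page does not state how the author preprocessed the data.

```python
import csv
import io

# Synthetic toy data in the CSSE column layout (values are NOT real counts).
SAMPLE = """Province/State,Country/Region,Lat,Long,6/25/20,6/26/20,6/27/20,6/28/20
,Brazil,-14.235,-51.9253,100,130,170,175
,India,21.0,78.0,50,60,75,95
"""


def daily_infections(csv_text, country="Brazil", time_steps=30):
    """Return the last `time_steps` (date, new_cases) pairs for one country.

    Assumes cumulative counts per date; daily values are day-to-day differences.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]  # skip Province/State, Country/Region, Lat, Long
    totals = [0] * len(dates)
    found = False
    for row in reader:
        if len(row) > 1 and row[1] == country:
            found = True
            # Sum over rows so countries split into provinces are aggregated.
            for i, value in enumerate(row[4:]):
                totals[i] += int(value)
    if not found:
        raise ValueError(f"country not found: {country}")
    # The first date has no previous day to subtract, so it is dropped.
    diffs = [today - prev for prev, today in zip(totals, totals[1:])]
    return list(zip(dates[1:], diffs))[-time_steps:]


if __name__ == "__main__":
    for date, new_cases in daily_infections(SAMPLE, "Brazil", time_steps=2):
        print(date, new_cases)
```

To apply the sketch to the real data, download one of the two raw files listed above and pass its text to the function with `time_steps=30` or `time_steps=40`, matching the window lengths reported for this dataset.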

Steps to reproduce

The steps performed to produce the models in each training session, using the same software and hardware, are reported in the model files' metadata, the json settings file of each model, the csv files, and the pkl files in the settings folder.

Categories

Infectious Disease, Time Series, Brazil, Forecasting Model, Recurrent Neural Network, Deep Learning, Statistical Prediction, COVID-19

Licence