COVID-19 Feature selection datasets

Published: 30 August 2022| Version 1 | DOI: 10.17632/wbjjz9bzxf.1
Gabriel Pena


Datasets used to run feature selection routines on COVID-19 peaks of cases and deaths. To be published soon. Data are from Our World in Data and the Institute for Health Metrics and Evaluation (mobility and mask-use attributes only).


Steps to reproduce

1. Apply a low-pass filter to the reported data to remove noise (example: moving-average or Gaussian filters).
2. Select the genuine peaks and discard spurious ones.
3. Normalize the dependent attributes by country population.
4. Fill in missing values with an imputation algorithm (example: kNN imputation).
5. Fix the number of output classes (example: by bin-number estimation).
6. Classify the peaks (country + date) into the output classes (example: by K-Means clustering).
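
The steps above can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch, not the code used to build the datasets: every function name, the 7-day window, the height threshold used to reject spurious peaks, the per-million scaling, Sturges' rule as the bin-number estimator, and the quantile-based K-Means initialization are assumptions made for the example. Libraries such as SciPy and scikit-learn offer ready-made equivalents for peak detection, kNN imputation and K-Means.

```python
import math

def moving_average(series, window=7):
    """Step 1: centered moving-average low-pass filter (window shrinks at the edges)."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def pick_peaks(smoothed, min_height):
    """Step 2: keep local maxima at or above min_height (assumed rejection rule)."""
    return [i for i in range(1, len(smoothed) - 1)
            if smoothed[i - 1] < smoothed[i] >= smoothed[i + 1]
            and smoothed[i] >= min_height]

def per_million(value, population):
    """Step 3: normalize a count by country population (per million inhabitants)."""
    return value * 1_000_000 / population

def knn_impute(rows, k=2):
    """Step 4: replace None with the column mean over the k nearest rows.
    Assumes every column has an observed value in at least one other row."""
    def dist(a, b):
        # Root mean squared difference over the columns observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf
    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is None:
                donors = [r for n, r in enumerate(rows)
                          if n != i and r[j] is not None]
                donors = sorted(donors, key=lambda r: dist(row, r))[:k]
                new_row[j] = sum(r[j] for r in donors) / len(donors)
        filled.append(new_row)
    return filled

def sturges_bins(n_samples):
    """Step 5: number of classes from Sturges' rule (one possible estimator)."""
    return math.ceil(math.log2(n_samples)) + 1

def kmeans_1d(values, k, iters=50):
    """Step 6: Lloyd's K-Means on one attribute, with deterministic quantile init."""
    def assign(centers):
        return [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
    ordered = sorted(values)
    centers = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    for _ in range(iters):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return assign(centers), centers

if __name__ == "__main__":
    # Made-up daily case counts for one hypothetical country of 2 million people.
    cases = [0, 2, 1, 5, 9, 20, 35, 30, 18, 9, 4, 6, 3, 2, 1]
    smoothed = moving_average(cases, window=3)
    peaks = pick_peaks(smoothed, min_height=10)
    heights = [per_million(smoothed[i], 2_000_000) for i in peaks]
    print(peaks, heights)
```

In a real run, each peak (country + date) would contribute one row of attributes; `knn_impute` would fill the gaps in that table, and `kmeans_1d` (or a multivariate K-Means) would assign each row to one of the `sturges_bins(len(rows))` classes.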


Universidad Nacional de Tres de Febrero


Epidemiology, COVID-19