OpenML study 7 - meta-datasets

Name: OpenML study 7 - meta-datasets
Creator: Ivan Olier
Published: 2020-11-04T18:00:10.860Z
Keywords: Machine Learning

doi:10.17632/7xx7ty87x2.1

Published: 4 November 2020 | Version 1 | DOI: 10.17632/7xx7ty87x2.1

Contributor: Ivan Olier

Description

We retrieved data from an earlier meta-learning study hosted on OpenML (details are available at https://www.openml.org/s/7). After excluding a few tasks and algorithms that lacked sufficient evaluations in OpenML, this yielded a set of 10840 evaluations on 351 tasks (datasets) and 53 machine learning methods (called flows on OpenML) from mlr (Bischl et al., 2016). From each task, 21 dataset descriptors were extracted, such as the number of examples, the number of missing values, and the percentage of numeric features. We formed one meta-dataset per machine learning method. Each observation in a meta-dataset represents an original OpenML task, and each feature is a dataset descriptor. The original aim of the study was to predict the area under the ROC curve (AUC). In total, we produced 53 meta-datasets, each covering a different number of OpenML tasks, ranging from just over 100 to about 250.
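The construction described above can be sketched in a few lines of Python with pandas. This is only an illustration under stated assumptions, not the code used to produce the files: the column names (`task_id`, `flow`, `auc`, `n_examples`, `n_missing_values`, `pct_numeric_features`), the flow names, and all values are hypothetical toy data, and only 3 of the 21 descriptors are shown. The sketch also assumes that AUC is attached to each task as the column to be predicted.

```python
import pandas as pd

# Hypothetical long-format evaluations: one row per (task, flow) pair.
# Flow names and AUC values are made up for illustration.
evaluations = pd.DataFrame({
    "task_id": [1, 1, 2, 2, 3],
    "flow": ["classif.rpart", "classif.svm",
             "classif.rpart", "classif.svm",
             "classif.rpart"],
    "auc": [0.81, 0.86, 0.74, 0.79, 0.92],
})

# Hypothetical dataset descriptors: one row per task (3 of 21 shown).
descriptors = pd.DataFrame({
    "task_id": [1, 2, 3],
    "n_examples": [150, 1000, 48842],
    "n_missing_values": [0, 12, 6465],
    "pct_numeric_features": [100.0, 35.0, 42.9],
}).set_index("task_id")


def build_meta_datasets(evaluations, descriptors):
    """Return one meta-dataset per flow.

    Each meta-dataset has one row per task evaluated with that flow,
    the dataset descriptors as features, and AUC as the last column.
    """
    meta = {}
    for flow, rows in evaluations.groupby("flow"):
        target = rows.set_index("task_id")["auc"]
        # The inner join keeps only tasks that have an evaluation for this flow,
        # which is why meta-datasets can differ in their number of rows.
        meta[flow] = descriptors.join(target, how="inner")
    return meta


meta_datasets = build_meta_datasets(evaluations, descriptors)
for flow, table in meta_datasets.items():
    print(flow, table.shape)
```

Because each flow was evaluated on a different subset of tasks, the resulting tables have different numbers of rows, which matches the varying sizes of the 53 meta-datasets described above.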

Institutions

Liverpool John Moores University
