FingerprintHarGasFacialDataset

Published: 09-03-2021 | Version 1 | DOI: 10.17632/vpjt5v26tc.1
Description

This dataset has 116 rows. Each row represents one dataset drawn from a collection of public datasets covering three domains (Human Activity Recognition, Gas analysis, and Facial expression recognition using Kinect), so it is in fact a meta-dataset (a dataset of datasets). The public datasets come from the following sources:

- University of Texas Multimodal Human Action Dataset
- Opportunity Activity Recognition
- Physical Activity Monitoring for Aging People Version 2
- Mobile Health data set
- Daily and Sports Activities
- Human Activities and Postural Transitions
- Gas Sensor Array Drift
- Grammatical Facial Expressions

Each column is a feature taken from one of the three domains mentioned above:

- The first 9 (F0-F8) come from the HAR domain
- The next 3 (F9-F11) come from the Gas domain
- The last 3 (F12-F14) come from the Face expressions domain

Note that there are more rows than compiled public datasets because, for some of them, each pair of sensors was treated as a separate dataset.

The label (last column) is the fusion architecture that performed best, out of the following:

1.- Aggregation of features
2.- Vote with shuffled features
3.- Voting
4.- Voting with CART, LR, and RCF for all features
5.- Multi-View Stacking with shuffled features
6.- Multi-View Stacking
7.- Multi-View Stacking with CART, RCF, and LR
8.- AdaBoost

When no statistically significant difference was found between the best-performing method and the default method (aggregation of features), the default method was taken as the label.
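The column layout and label set described above can be written down as a small lookup, which may help when slicing the table by domain. This is only a sketch: the feature names F0-F14 and the eight method names come from the description, while the constant and function names are illustrative, and storing the label as an integer code from 1 to 8 is an assumption about the file.

```python
# Feature columns per domain, as listed in the description.
# DOMAIN_COLUMNS, FUSION_LABELS and the helpers below are illustrative names.
DOMAIN_COLUMNS = {
    "HAR": [f"F{i}" for i in range(0, 9)],     # F0-F8
    "Gas": [f"F{i}" for i in range(9, 12)],    # F9-F11
    "Face": [f"F{i}" for i in range(12, 15)],  # F12-F14
}

# Assumption: the label column holds the integer codes 1-8 used in the list.
FUSION_LABELS = {
    1: "Aggregation of features",
    2: "Vote with shuffled features",
    3: "Voting",
    4: "Voting with CART, LR, and RCF for all features",
    5: "Multi-View Stacking with shuffled features",
    6: "Multi-View Stacking",
    7: "Multi-View Stacking with CART, RCF, and LR",
    8: "AdaBoost",
}


def domain_of(column: str) -> str:
    """Return the domain a feature column belongs to."""
    for domain, columns in DOMAIN_COLUMNS.items():
        if column in columns:
            return domain
    raise KeyError(f"unknown feature column: {column}")


def label_name(code: int) -> str:
    """Translate a label code into the fusion architecture name."""
    return FUSION_LABELS[code]
```

For example, `domain_of("F10")` gives `"Gas"` and `label_name(6)` gives `"Multi-View Stacking"`.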

Steps to reproduce

1.- Compile the original data from the public datasets mentioned above. This requires some preprocessing, different for each dataset, that cannot be included here; please refer to Antonio Aguileta's Ph.D. document: "Predicting the best sensor fusion method for recognizing human activity using a machine learning approach based on a statistical signature meta-data set and its generalization to other domains", Tecnologico de Monterrey, Mexico, 2020.
2.- Use the SFFS algorithm to find the 9, 3, and 3 best features for the HAR, Gas, and Face expressions domains, respectively.
3.- For each row, put 0 in every column that belongs to a domain other than the one the row refers to.
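The zero-filling in the last step can be sketched as follows. This is a minimal illustration, not the authors' code: the function and constant names are hypothetical, the feature values in the usage example are made up, and the column positions simply follow the F0-F8 / F9-F11 / F12-F14 layout given in the description.

```python
# Column ranges (start inclusive, stop exclusive) for each domain,
# following the F0-F8 / F9-F11 / F12-F14 layout. Names are illustrative.
DOMAIN_SLICES = {"HAR": (0, 9), "Gas": (9, 12), "Face": (12, 15)}
N_FEATURES = 15


def build_row(domain, features, label):
    """Build one meta-dataset row: the selected features of `domain`
    in their own columns, 0 in every other column, label last."""
    start, stop = DOMAIN_SLICES[domain]
    if len(features) != stop - start:
        raise ValueError(
            f"{domain} expects {stop - start} features, got {len(features)}"
        )
    row = [0.0] * N_FEATURES          # columns of the other domains stay at 0
    row[start:stop] = list(features)  # fill only this domain's columns
    return row + [label]


# Usage with made-up values: a Gas-domain dataset whose best method is 6.
example = build_row("Gas", [0.1, 0.2, 0.3], 6)
```

Here `example` has zeros in F0-F8 and F12-F14, the three Gas features in F9-F11, and the label in the final position.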