MFCCs for Multi-class Human Action Analysis: A Benchmark Dataset

Published: 25 July 2023 | Version 1 | DOI: 10.17632/6ng2kgvnwk.1


This dataset is a collection of Mel Frequency Cepstral Coefficients (MFCCs) computed from audio signals associated with human actions, including walking, running, jumping, and dancing. MFCCs are time-frequency representations that reflect the power spectrum of an audio signal after it has been passed through a Mel-scale filter bank, which mimics human auditory perception. They are calculated in a series of signal processing stages, including the Fourier Transform, the Mel-scale transformation, and the Discrete Cosine Transform. Each MFCC representation covers one segment of the corresponding audio signal and is annotated with a label specifying the type of human action it represents. These labels support supervised learning, so the dataset can be used to train and evaluate machine learning models for human action recognition, classification, segmentation, and detection from audio signals. It is intended for researchers and practitioners in signal processing, computer vision, and machine learning who build algorithms for human action analysis through audio signals.
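The processing stages named in the description (Fourier Transform, Mel-scale filter bank, Discrete Cosine Transform) can be sketched as follows. This is a minimal NumPy illustration of a common MFCC recipe, not the procedure used to build this dataset. The sample rate, frame length, hop size, FFT size, filter count, and number of coefficients are all assumed values chosen for the example, and the logarithm applied before the DCT is the conventional step even though the description does not list it.

```python
import numpy as np


def hz_to_mel(hz):
    # Common Mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the Mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def dct_ii(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default parameters are illustrative assumptions.
    # 1. Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Fourier Transform -> power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Mel-scale filter bank, then log compression.
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)
    # 4. Discrete Cosine Transform -> cepstral coefficients.
    return dct_ii(log_mel, n_coeffs)


# Example: a one-second synthetic 440 Hz tone standing in for an audio clip.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
```

The result has one row per frame and one column per coefficient, so each row corresponds to a short segment of the input signal. How the segments in this dataset map to frames of this kind is not stated in the description.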


Edith Cowan University, University of Western Australia


Computer Vision Representation, Benchmarking, Multimodality, Image Analysis, Action Recognition


Higher Education Commission, Pakistan


Office of National Intelligence, Government of Australia