Arabic Natural Audio Dataset

Name: Arabic Natural Audio Dataset
Creator: Samira Klaylat
Published: 2018-05-30T14:30:16.925Z
Keywords: Speech Analysis, Arabic Language, Emotion Perception

Klaylat, Samira; Osman, Ziad; Zantout, Rached; Hamandi, Lama

doi:10.17632/xm232yxf7t.1

Published: 30 May 2018 | Version 1 | DOI: 10.17632/xm232yxf7t.1

Contributors:

Samira Klaylat, Ziad Osman, Rached Zantout, Lama Hamandi

Description

This is the first Arabic Natural Audio Dataset (ANAD), developed to recognize three discrete emotions: happy, angry, and surprised. Eight videos of live calls between an anchor and a person outside the studio were downloaded from online Arabic talk shows. Each video was then divided into turns: callers and receivers. To label each video, 18 listeners listened to it and selected whether they perceived a happy, angry, or surprised emotion. Silence, laughs, and noisy chunks were removed. Every remaining chunk was then automatically divided into 1-second speech units, forming the final corpus of 1384 records.

Twenty-five acoustic features, also known as low-level descriptors (LLDs), were extracted from each unit. These features are: intensity, zero-crossing rate, MFCC 1-12 (Mel-frequency cepstral coefficients), F0 (fundamental frequency), F0 envelope, probability of voicing, and LSP frequency 0-7. Nineteen statistical functions were applied to every feature. The functions are: maximum, minimum, range, absolute position of maximum, absolute position of minimum, arithmetic mean, Linear Regression1, Linear Regression2, Linear RegressionA, Linear RegressionQ, standard deviation, kurtosis, skewness, quartiles 1, 2, and 3, and inter-quartile ranges 1-2, 2-3, and 1-3. The delta coefficient of every LLD is also computed as an estimate of its first derivative, which doubles the number of contours to 50 and gives a total of 50 x 19 = 950 features.
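The feature count in the description can be checked by enumeration. The following is a minimal Python sketch, not the authors' extraction code: it only builds the list of feature labels implied by the description. The short identifiers (for example `zcr`, `voicing_prob`, `iqr1_2`) and the `_delta` suffix are illustrative assumptions and may not match the column names in the published files.

```python
# Low-level descriptors (LLDs) named in the description.
# The short identifiers are hypothetical labels chosen for this sketch.
LLDS = (
    ["intensity", "zcr"]                      # intensity, zero-crossing rate
    + [f"mfcc_{i}" for i in range(1, 13)]     # MFCC 1-12
    + ["f0", "f0_envelope", "voicing_prob"]   # F0, F0 envelope, probability of voicing
    + [f"lsp_{i}" for i in range(8)]          # LSP frequency 0-7
)

# The nineteen statistical functions named in the description.
FUNCTIONALS = [
    "max", "min", "range", "maxpos", "minpos", "amean",
    "linreg1", "linreg2", "linregA", "linregQ",
    "stddev", "kurtosis", "skewness",
    "quartile1", "quartile2", "quartile3",
    "iqr1_2", "iqr2_3", "iqr1_3",
]


def feature_names():
    """Return one label per (contour, function) pair.

    Each LLD contributes two contours: the LLD itself and its delta
    (first-derivative estimate).
    """
    names = []
    for lld in LLDS:
        for contour in (lld, lld + "_delta"):
            for func in FUNCTIONALS:
                names.append(f"{contour}_{func}")
    return names


NAMES = feature_names()
print(len(LLDS), len(FUNCTIONALS), len(NAMES))  # 25 19 950
```

The enumeration confirms the arithmetic: 25 LLDs plus their 25 deltas give 50 contours, and 19 functions per contour give 950 features for each 1-second speech unit.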
