Unlabelled Arabic Speech Dataset

Published: 18 September 2023 | Version 1 | DOI: 10.17632/yhy76b2c7b.1
Noora Al Roken


Speech is crucial for communication, and its automated analysis has become increasingly important in the era of artificial intelligence. Many fields, including security and forensics, rely on speech analysis, for example to identify people. Arabic, however, poses a challenge: most speech analysis models are trained primarily on English data and do not work well with Arabic. To address this, a large dataset of over a million Arabic speech samples was built for training models and improving their performance. The samples were collected from Arabic podcasts, so they are clear and noise-free. The audio recordings were extracted and split into one-second segments, which were then transformed into Mel-spectrograms using a 22 kHz sampling rate, a frame size of 2048 samples, and a hop length of 512 samples. Researchers working with Arabic speech data can use this dataset to experiment with different machine learning and deep learning models. Using our dataset to improve Arabic speaker identification, we trained a Siamese speech embedding model and tested its performance on a benchmark dataset. The results can be viewed in the paper titled "Unsupervised Arabic Speech Embedding Model for Speaker Identification": https://ieeexplore-ieee-org.aus.idm.oclc.org/document/10191576
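
To make the preprocessing parameters above concrete, the following is a minimal sketch of the segmentation step and of how many spectrogram frames a one-second segment yields. It is not the authors' pipeline. Several details are assumptions, not stated in the description: "22 kHz" is read as 22,050 Hz, trailing audio shorter than one second is dropped, and the function names `split_into_seconds` and `frame_count` are hypothetical. The number of Mel bands is not given in the description, so only the time axis is computed here.

```python
# Assumption: "22 kHz" is read as 22,050 Hz; the description does not give the exact value.
SAMPLE_RATE = 22_050
FRAME_SIZE = 2048   # analysis window length in samples (from the description)
HOP_LENGTH = 512    # step between successive frames in samples (from the description)


def split_into_seconds(samples, sample_rate=SAMPLE_RATE):
    """Split a sequence of samples into non-overlapping one-second segments.

    Assumption: a trailing remainder shorter than one second is dropped.
    """
    full_segments = len(samples) // sample_rate
    return [
        samples[i * sample_rate:(i + 1) * sample_rate]
        for i in range(full_segments)
    ]


def frame_count(num_samples, frame_size=FRAME_SIZE, hop_length=HOP_LENGTH,
                centered=True):
    """Return the number of short-time frames for a signal of num_samples.

    centered=True assumes the signal is padded by frame_size // 2 samples on
    each side, so one frame is centred on every hop position.
    centered=False counts only frames that fit entirely inside the signal.
    """
    if centered:
        return 1 + num_samples // hop_length
    if num_samples < frame_size:
        return 0
    return 1 + (num_samples - frame_size) // hop_length


if __name__ == "__main__":
    # Hypothetical recording: 3 seconds of silence plus a short remainder.
    recording = [0.0] * (3 * SAMPLE_RATE + 1000)
    segments = split_into_seconds(recording)
    print(len(segments), len(segments[0]))
    print(frame_count(SAMPLE_RATE), frame_count(SAMPLE_RATE, centered=False))
```

Under these assumptions, each one-second segment holds 22,050 samples and produces 44 frames with centred padding, or 40 frames without padding, so a Mel-spectrogram of one segment would have 44 (or 40) columns along the time axis.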



Signal Processing, Speaker Recognition, Arabic Language, Audio Recognition, Deep Learning, Deep Transfer Learning, Transfer Learning