Bangla Audio Dataset: Original and DeepFake Voices for AI-Based Voice Analysis and Detection

Published: 11 June 2024 | Version 3 | DOI: 10.17632/4ftmwt86vr.3
Md Akteruzzaman Dipto,


Total number of audios: 4500 (Original: 2250; Fake: 2250)
Total number of individuals: 75
Total number of distinct sets: 15
Total number of distinct sentences per set: 30
Number of individuals speaking each set: 5
Duration of each sentence recording: 2 to 5 seconds
Male-female ratio: nearly balanced

The parent folder, named 'DATASET', contains 75 folders with unique names that identify their characteristics. Each of these folders contains two subfolders, 'Real' and 'Fake', and each subfolder holds 30 voice recordings named numerically.

**Dataset Usage Agreement (DUA)**

1. Grant of Use
1.1 The Provider grants the User a non-exclusive, non-transferable license to use the dataset solely for academic and research purposes.
1.2 The User agrees to use the dataset only for purposes consistent with this Agreement and will not use the dataset for any commercial purposes without the prior written consent of the Provider.

2. Data Security and Privacy
2.1 The User shall ensure that the dataset is stored securely and is not accessible to unauthorized individuals.
2.2 The User agrees to comply with all applicable data protection and privacy laws in relation to the use of the dataset.

3. Attribution
3.1 The User agrees to provide appropriate acknowledgment to the Provider in any publications, presentations, or other outputs that utilize the dataset. The acknowledgment should include a citation.

4. Restrictions on Use
4.1 The User shall not distribute, share, sell, or sublicense the dataset to any third party without the prior written consent of the Provider.
4.2 The User shall not attempt to re-identify any individuals from the dataset.
4.3 The User shall not use the dataset to develop any technologies or applications that are intended to be used for malicious purposes, including but not limited to creating or distributing deepfakes.

5. Intellectual Property
5.1 The Provider retains all rights, title, and interest in and to the dataset, including any intellectual property rights.
5.2 The User agrees not to claim ownership of the dataset or any derivative works based on the dataset.

6. Liability
6.1 The User acknowledges that the dataset is a research tool provided for academic purposes and assumes full responsibility for the use of the dataset.
6.2 The Provider shall not be liable for any damages arising from the use of the dataset.

7. Termination
7.1 The Provider reserves the right to terminate this Agreement at any time if the User breaches any terms of this Agreement.
7.2 Upon termination, the User agrees to destroy all copies of the dataset in their possession.

By using the dataset, the User acknowledges that they have read, understood, and agreed to be bound by the terms of this Agreement.

Contact info: Sumaiya Akhtar Mitu
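
The folder layout and totals given in the description can be checked after download with a short script. The following is a minimal sketch, not part of the dataset: the function names `index_dataset`, `summarize`, and `expected_counts` are hypothetical, and because the description does not state the audio file format, the script counts every regular file inside each 'Real' and 'Fake' folder rather than filtering by extension.

```python
from collections import Counter
from pathlib import Path

LABELS = ("Real", "Fake")  # subfolder names as given in the description


def index_dataset(root):
    """Return a list of (speaker_folder, label, file_path) tuples."""
    records = []
    for speaker_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for label in LABELS:
            label_dir = speaker_dir / label
            if not label_dir.is_dir():
                continue  # tolerate incomplete speaker folders
            # Assumption: every regular file here is an audio recording.
            for audio in sorted(f for f in label_dir.iterdir() if f.is_file()):
                records.append((speaker_dir.name, label, audio))
    return records


def summarize(records):
    """Count speaker folders, total files, and files per label."""
    per_label = Counter(label for _, label, _ in records)
    speakers = {speaker for speaker, _, _ in records}
    return {"speakers": len(speakers), "total": len(records), **per_label}


def expected_counts(sets=15, speakers_per_set=5, sentences_per_set=30):
    """Derive the stated totals from the stated design of the dataset."""
    speakers = sets * speakers_per_set      # 15 * 5 = 75 individuals
    real = speakers * sentences_per_set     # 75 * 30 = 2250 original clips
    return {"speakers": speakers, "total": 2 * real, "Real": real, "Fake": real}
```

Assuming the archive has been extracted to a local folder named `DATASET`, comparing `summarize(index_dataset("DATASET"))` with `expected_counts()` shows whether the download matches the stated 75 folders and 4500 recordings. The resulting records also carry the Real/Fake label for each file, which is a convenient starting point for a detection experiment.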



University of Asia Pacific


Audio Analysis, Deepfake