A Dataset for Voice-Based Human Identity Recognition

Published: 31 March 2022 | Version 2 | DOI: 10.17632/zw4p4p7sdh.2
Baha' Alsaify


This dataset is divided into two main sub-datasets: samePhrase and differentPhrase. Each speaker has the same label in both sub-datasets. In the samePhrase sub-dataset, each speaker repeats the sentence “Machine Learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10” ten times, and each sample is between seven and ten seconds long. In the differentPhrase sub-dataset, each speaker contributed ten different samples, each a phrase selected randomly from sources such as books, song lyrics, or one-line texts. No sample in the differentPhrase sub-dataset exceeds ten seconds.
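Because speaker labels are shared across the two sub-datasets, samples can be grouped per speaker and checked against the expected count of ten per sub-dataset. The following is a minimal sketch only. The directory layout (`<root>/samePhrase/<speaker_label>/` and `<root>/differentPhrase/<speaker_label>/`), the `.wav` extension, and the function names `index_samples` and `incomplete_speakers` are assumptions made for illustration; they are not documented by the dataset and should be adjusted to match the downloaded archive.

```python
from collections import defaultdict
from pathlib import Path

# Sub-dataset names as given in the description above.
SUBSETS = ("samePhrase", "differentPhrase")


def index_samples(root, extension=".wav"):
    """Map speaker label -> sub-dataset name -> sorted list of sample paths.

    ASSUMPTION: each sub-dataset is a folder containing one folder per
    speaker label. The real archive may be organized differently.
    """
    index = defaultdict(lambda: {name: [] for name in SUBSETS})
    for subset in SUBSETS:
        subset_dir = Path(root) / subset
        if not subset_dir.is_dir():
            continue
        for speaker_dir in sorted(subset_dir.iterdir()):
            if not speaker_dir.is_dir():
                continue
            files = sorted(
                p for p in speaker_dir.iterdir()
                if p.suffix.lower() == extension
            )
            index[speaker_dir.name][subset] = files
    return dict(index)


def incomplete_speakers(index, expected=10):
    """Return labels whose sample count differs from `expected` in any sub-dataset."""
    return sorted(
        label
        for label, subsets in index.items()
        if any(len(subsets[name]) != expected for name in SUBSETS)
    )
```

Calling `incomplete_speakers(index_samples("path/to/dataset"))` would list any speaker whose folders do not hold ten samples in both sub-datasets, which is a quick sanity check before training an identification model.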



Jordan University of Science and Technology


Machine Learning, Human Identification, Human Voice