Hindko Voice Dataset (HVD)

Published: 16 October 2024 | Version 1 | DOI: 10.17632/yjhz8z7mv5.1
Contributors:

Description

Hindko is a language spoken mostly in the northwestern areas of Pakistan, with about 8 million speakers. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population of districts such as Haripur, Abbottabad, and Mansehra speak Hindko. Spoken Hindko content covers a wide range of subjects, including religion, education, poetry, politics, theater, and many more. Despite this, Hindko still needs a voice recognition system that enhances accessibility, preserves the language, and promotes digital inclusion for its speakers. The dataset consists of 20 Hindko numbers, from 1 to 20. We asked every individual to speak these 20 numbers in one recording and send it over WhatsApp. About 300 individuals participated in this project, and we took 3 samples from every individual. We then used Audacity to split every number out of the recording, and saved in separa
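
The description says each recording was split into individual numbers by hand in Audacity. As a rough illustration of what that step involves, the following is a minimal sketch of an automated, energy-based alternative. It is not the contributors' method: the function name `split_on_silence`, the frame length, the threshold, and the minimum pause length are all assumptions chosen for the example, and real WhatsApp recordings would likely need tuning and manual checking.

```python
def split_on_silence(samples, frame_len=160, threshold=0.02, min_silence_frames=5):
    """Return (start, end) sample indices of non-silent segments.

    Hypothetical helper for illustration only. `samples` is a sequence of
    floats in [-1.0, 1.0]; a frame counts as silent when its mean absolute
    amplitude is below `threshold` (an assumed value).
    """
    segments = []
    start = None       # start index of the segment currently being built
    silent_run = 0     # consecutive silent frames seen inside that segment
    n_frames = len(samples) // frame_len

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(abs(x) for x in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = i * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Close the segment at the first frame of the pause.
                end = (i + 1 - silent_run) * frame_len
                segments.append((start, end))
                start = None
                silent_run = 0

    if start is not None:
        # The recording ended while a segment was still open.
        end = (n_frames - silent_run) * frame_len
        segments.append((start, end))
    return segments
```

With one recording containing the 20 spoken numbers, a call such as `split_on_silence(samples)` would ideally return 20 index pairs, one per number; any other count would be a signal to inspect that recording by hand.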

Files

Institutions

Pak-Austria Fachhochschule Institute of Applied Sciences and Technology

Categories

Natural Language Processing, Speech Recognition, Machine Learning, Voice Recognition

Licence