Afaan Oromo Text to Speech Synthesis dataset

Published: 28 December 2022 | Version 1 | DOI: 10.17632/mpy85ns82z.1
Chala Sembeta


Afaan Oromo Text to Speech Synthesis dataset is a public domain speech dataset consisting of 8,076 short audio clips of a single male speaker reading sentences collected from legitimate sources such as news media, non-fiction books, and the Afaan Oromo Holy Bible. A transcription and its normalized text are provided for each clip. The recording process took two weeks and produced a total of 17 hours of speech across the 8,076 .wav files.

File Format

Metadata is provided in metadata.csv. This file consists of one record per line, with fields delimited by the pipe character (|). The fields are:

ID: the name of the corresponding .wav file
Transcription: words spoken by the reader (UTF-8)
Normalized Transcription: transcription with numbers, ordinals, and monetary units expanded into full words (UTF-8)

Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz.
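The metadata layout and audio format described above can be read and checked with a few lines of Python. The following is a minimal sketch, not an official loader. It assumes that metadata.csv has no header row and that quote characters inside a transcription are literal text; neither is stated in the description. The names Record, read_metadata, and matches_spec are hypothetical and were chosen for this illustration, and the sample rows are made-up placeholders, not real dataset content.

```python
import csv
import io
import wave
from typing import NamedTuple


class Record(NamedTuple):
    """One metadata.csv line (hypothetical field names)."""
    clip_id: str
    transcription: str
    normalized: str


def read_metadata(stream):
    """Parse pipe-delimited lines into Record tuples.

    Assumption: no header row. Lines without exactly three fields are skipped.
    """
    # QUOTE_NONE keeps any quote characters in the text as literal characters.
    reader = csv.reader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    return [Record(*row) for row in reader if len(row) == 3]


def matches_spec(wav_file):
    """Return True if the WAV is single-channel, 16-bit, 22050 Hz."""
    with wave.open(wav_file, "rb") as w:
        # Sample width is reported in bytes, so 16-bit audio gives 2.
        return (w.getnchannels(), w.getsampwidth(), w.getframerate()) == (1, 2, 22050)


# Made-up placeholder rows standing in for the real file.
sample = io.StringIO("clip_0001|Cost: 5|Cost: five\nclip_0002|Hello|Hello\n")
records = read_metadata(sample)
```

With the real files, the same functions could be called as read_metadata(open("metadata.csv", encoding="utf-8", newline="")) and matches_spec(path_to_clip), where the paths depend on where the archive was unpacked. As a rough sanity check on the stated totals, 17 hours spread over 8,076 clips works out to about 7.6 seconds per clip on average.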



Adama Science and Technology University, Mettu University


Natural Language Processing, Text-to-Speech, Speech Synthesis