Corpus Minangkabau

Published: 27 February 2023 | Version 1 | DOI: 10.17632/kch8f4smtw.1
Fadhli Almuiini Ahda, Danang Arbian


In the development of language technologies such as machine translation and speech recognition, a language corpus is an important source of training data. With a corpus of Minangkabau and Indonesian, language technology developers can build more accurate and effective models and systems. The data for this corpus was collected from a variety of sources, including websites and books that provide material in Minangkabau and Indonesian. The published dataset contains up to 520 Minangkabau and Indonesian sentences, all of which take the form of rhymes and poetry.



Universitas Negeri Malang


Artificial Intelligence, Natural Language Processing, Machine Translation