New Common Swahili Typos Dataset

Published: 16 September 2024 | Version 1 | DOI: 10.17632/b999rv86dr.1
Contributors:
Vicent Wilson, Jeremiah Challe

Description

An updated list of common Swahili typos, each paired with its correct form. The dataset is lowercased and comma-separated. Each typo is meant to be replaced by its correct form so that text stays consistent during vectorization, which produces the vectors used in machine learning tasks.
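
The replacement step described above could be sketched as follows. This is a minimal illustration, not the contributors' own code: it assumes each row has the form `typo,proper_word`, and the function names `load_typo_map` and `normalize` as well as the two sample rows are hypothetical and not taken from the dataset.

```python
import csv

def load_typo_map(lines):
    """Build a typo -> correct-word dict from 'typo,proper_word' rows (assumed layout)."""
    typo_map = {}
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        typo, proper = row[0].strip().lower(), row[1].strip().lower()
        if typo and proper:
            typo_map[typo] = proper
    return typo_map

def normalize(text, typo_map):
    """Lowercase the text and replace each whitespace-separated token found in the map."""
    return " ".join(typo_map.get(tok, tok) for tok in text.lower().split())

# Hypothetical sample rows for illustration only; not copied from the dataset.
SAMPLE = "habar,habari\nasnte,asante\n"

typo_map = load_typo_map(SAMPLE.splitlines())
cleaned = normalize("Habar yako asnte sana", typo_map)
print(cleaned)  # habari yako asante sana
```

Running the normalization before vectorization means that a typo and its correct form map to the same token, so they do not end up as separate vocabulary entries. Splitting on whitespace is a simplification; punctuation attached to a word would prevent a match and may need separate handling.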

Files

Institutions

University of Dar es Salaam College of Informatics and Communications Technologies

Categories

Artificial Intelligence, Natural Language Processing, Machine Learning, Deep Learning

Licence