Amharic text dataset extracted from memes for hate speech detection or classification
Published: 8 June 2023 | Version 2 | DOI: 10.17632/gw3fdtw5v7.2
Contributor:
Mequanent Degu
Description
The dataset was collected from social media platforms such as Facebook and Telegram and then further processed. The collection contains three variants. orginal_cleaned: the text is neither stemmed nor stopword-filtered. stopword_removed: stopwords are removed, but the text is not stemmed. stemed: the text is stemmed and stopwords are removed. Stemming was done using HornMorpho, developed by Michael Gasser (available at https://github.com/hltdi/HornMorpho). All variants are normalized and free from noise such as punctuation marks and emojis.
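The cleaning and stopword-removal steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contributor's actual pipeline: the homophone mapping, the stopword list, and the function names `clean` and `remove_stopwords` are all hypothetical, and the description does not say which normalization rules were really applied. Stemming with HornMorpho is left out because its interface is not described here.

```python
import re

# ASSUMPTION: a small homophone mapping of the kind often used for Amharic
# normalization. Only the base forms are listed; the rules actually applied
# to this dataset are not documented in the description.
HOMOPHONES = str.maketrans({"ሐ": "ሀ", "ኀ": "ሀ", "ሠ": "ሰ", "ዐ": "አ", "ፀ": "ጸ"})

# ASSUMPTION: a tiny, hypothetical stopword list for illustration only.
STOPWORDS = {"እና", "ነው", "ላይ"}

# Anything that is neither an Ethiopic syllable (U+1200..U+135A) nor whitespace
# is treated as noise: Ethiopic and Latin punctuation, digits, emojis, Latin text.
NOISE = re.compile(r"[^\u1200-\u135A\s]+")


def clean(text: str) -> str:
    """Strip noise, map homophone letters, and collapse whitespace."""
    text = NOISE.sub(" ", text)
    text = text.translate(HOMOPHONES)
    return " ".join(text.split())


def remove_stopwords(text: str, stopwords=STOPWORDS) -> str:
    """Drop whitespace-separated tokens that appear in the stopword set."""
    return " ".join(tok for tok in text.split() if tok not in stopwords)


if __name__ == "__main__":
    raw = "ሠላም፣ እና ዓለም! 😀"
    cleaned = clean(raw)                  # comparable to orginal_cleaned
    filtered = remove_stopwords(cleaned)  # comparable to stopword_removed
    print(cleaned)
    print(filtered)
```

Keeping only characters in the Ethiopic syllable range is a deliberately blunt choice: it also drops digits and any Latin-script words, which may or may not match how the published files were produced.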
Institutions
Debre Markos University College of Technology
Categories
Natural Language Processing, Data Mining, Machine Learning Algorithm, Deep Learning