Amharic text dataset extracted from memes for hate speech detection or classification

Name: Amharic text dataset extracted from memes for hate speech detection or classification
Creator: Mequanent Degu
Published: 2023-06-08T06:48:22.528Z
Keywords: Natural Language Processing, Data Mining, Machine Learning Algorithm, Deep Learning

doi:10.17632/gw3fdtw5v7.2

Published: 8 June 2023 | Version 2 | DOI: 10.17632/gw3fdtw5v7.2

Contributor:

Mequanent Degu

Description

The dataset was collected from social media platforms such as Facebook and Telegram and then further processed. It is provided in three variants: orginal_cleaned, in which the text is neither stemmed nor stripped of stopwords; stopword_removed, in which stopwords are removed but the text is not stemmed; and the stemmed dataset, in which the text is stemmed and stopwords are removed. Stemming was done using HornMorpho, developed by Michael Gasser (available at https://github.com/hltdi/HornMorpho). All datasets are normalized and free from noise such as punctuation marks and emojis.
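
The cleaning steps described above (normalization, removal of punctuation and emojis, and stopword removal) could be approximated along the following lines. This is only a minimal sketch, not the author's actual pipeline: the homophone map, the function names clean and remove_stopwords, and the sample stopwords are all assumptions made for illustration. Stemming is omitted because it relies on HornMorpho, whose interface is not described here.

```python
import unicodedata

# Assumed normalization: map a few Amharic homophone letters (base forms only)
# to a single representative. The dataset's real normalization rules are not
# documented in the description, so this table is purely illustrative.
HOMOPHONES = str.maketrans({
    "\u1210": "\u1200",  # HHA -> HA
    "\u1280": "\u1200",  # XA  -> HA
    "\u1220": "\u1230",  # SZA -> SA
    "\u12d0": "\u12a0",  # PHARYNGEAL A -> GLOTTAL A
    "\u1340": "\u1338",  # TZA -> TSA
})


def clean(text: str) -> str:
    """Normalize homophones and drop punctuation, symbols, and emojis."""
    text = text.translate(HOMOPHONES)
    kept = []
    for ch in text:
        major = unicodedata.category(ch)[0]
        is_variation_selector = "\ufe00" <= ch <= "\ufe0f"
        # P* = punctuation (includes Ethiopic marks such as U+1362),
        # S* = symbols (most emojis), C* = control/format/unassigned.
        if major in ("P", "S", "C") or is_variation_selector:
            kept.append(" ")
        else:
            kept.append(ch)
    # Collapse the runs of whitespace left behind by the removed characters.
    return " ".join("".join(kept).split())


def remove_stopwords(text: str, stopwords: set[str]) -> str:
    """Drop whitespace-separated tokens that appear in the stopword set."""
    return " ".join(tok for tok in text.split() if tok not in stopwords)


if __name__ == "__main__":
    # Hypothetical stopword list; the list used for the dataset is not given.
    sample_stopwords = {"\u12a5\u1293", "\u1290\u12cd"}
    raw = "\u1220\u120b\u121d\u1362 \U0001F600 \u12a5\u1293 \u12d3\u1208\u121d!!"
    print(remove_stopwords(clean(raw), sample_stopwords))
```

Under these assumptions, applying clean alone would correspond roughly to the orginal_cleaned variant, and applying remove_stopwords afterwards to the stopword_removed variant.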

Institutions

Debre Markos University College of Technology
