Ge’ez Language Homophonic Words Sense Disambiguation (WSD) Dataset : Contextual Word Disambiguates of Ge’ez Language with Homophonic Using Machine Learning

Published: 13 March 2024 | Version 1 | DOI: 10.17632/3m878pzf7j.1
Mequanent Degu Belete, Tigist Bezabh


This dataset has three columns: text, class, and homophonic word. The text column contains 1010 text samples covering 10 pairs of homophonic Ge'ez words: ነስሐ and ነስኀ, ሐየሰ and ኀየሰ, ጸመመ and ፀመመ, ቀሰመ and ቀሠመ, ሐደመ and ሀደመ, መሀረ and መሐረ, ኀለየ and ሐለየ, መልአ and መልዐ, ፈጸመ and ፈፀመ, and ሠርሐ and ሰርሐ. Each sample is a sentence that contains a homophonic word. The class column contains the contextual meaning (sense) of the homophonic word in the given sample. The homophonic word column contains the homophonic word identified in the given sample. The contextual meaning of each word was determined based on the Akalewold Kiflie dictionary.
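The three-column layout described above could be read and summarized roughly as in the following Python sketch. It assumes the data is available as UTF-8 CSV with a header line naming the columns exactly text, class and homophonic word; the actual file format and header spelling are not stated here and may differ. The rows in SAMPLE_CSV are placeholders, not records from the dataset, and load_rows and senses_per_word are hypothetical helper names.

```python
import csv
import io
from collections import Counter, defaultdict

# Placeholder rows, NOT taken from the dataset: the text and class values are
# stand-ins that only illustrate the three-column layout described above.
# The header names are an assumption about how the file is laid out.
SAMPLE_CSV = """text,class,homophonic word
<sentence containing ነስሐ>,<sense A>,ነስሐ
<sentence containing ነስኀ>,<sense B>,ነስኀ
<another sentence containing ነስሐ>,<sense A>,ነስሐ
"""


def load_rows(handle):
    """Read rows as dicts keyed by the column names in the header line."""
    return list(csv.DictReader(handle))


def senses_per_word(rows):
    """Count how often each sense label occurs for each homophonic word."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["homophonic word"]][row["class"]] += 1
    return counts


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = senses_per_word(rows)
for word, senses in counts.items():
    print(word, dict(senses))
```

To try this on a downloaded copy, the in-memory sample could be replaced with a file handle opened with encoding="utf-8" and newline="", so that the Ge'ez characters are decoded correctly.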



Debre Markos University College of Technology


Linguistics, Machine Learning