Amharic corpus
Published: 8 November 2021 | Version 1 | DOI: 10.17632/dtywyf3sth.1
Contributor:
Seid Muhie Yimam
Description
This is an Amharic text corpus used to build several semantic models. The dataset collection, the semantic models built from the dataset, and the classification models are described in the paper "Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets" (https://www.mdpi.com/1999-5903/13/11/275).
Steps to reproduce
The text was collected using several different scraping technologies.
Institutions
Universität Hamburg
Categories
Language Modeling, Corpus Analysis, Word Embedding