Additional Hindi Words for IndoWordNet

Published: 23 May 2022 | Version 1 | DOI: 10.17632/db8sh8js67.1
Contributors:
Dr Milind Audichya

Description

This dataset lists 13953 words that are missing from the Hindi WordNet. The research was conducted between December 2017 and April 2022, using IndoWordNet to check a collection of 5011 UTF-8 (Unicode Transformation Format) encoded Hindi verses and poems. Hindi has many dialects, which is one reason for the large number of missing words. Other words were not found in the Hindi WordNet because they are borrowed words, newly developed terms, misspellings, or combined words. The list can help improve the Hindi WordNet once these words have been carefully processed through the standard WordNet procedure for including new words.
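As a rough illustration of how a missing-word list of this kind could be derived, the sketch below collects Devanagari tokens from a set of texts and keeps those absent from a known lexicon. It is not the procedure used to build this dataset: the function name `missing_words`, the tokenization pattern, and the idea of passing the lexicon as a plain set of strings are all assumptions made for the example, and no IndoWordNet API is called.

```python
import re
import unicodedata

# Assumed tokenizer: runs of Devanagari letters and signs.
# The danda marks (U+0964, U+0965), the digits (U+0966-U+096F) and the
# abbreviation sign (U+0970) are left out so they act as separators.
DEVANAGARI_WORD = re.compile(r"[\u0900-\u0963\u0971-\u097F]+")


def missing_words(texts, lexicon):
    """Return the sorted words found in `texts` that are not in `lexicon`.

    `texts` is an iterable of UTF-8 decoded strings; `lexicon` is an
    iterable of known words (hypothetically exported from a WordNet).
    """
    # Normalize both sides so visually identical words compare equal.
    known = {unicodedata.normalize("NFC", word) for word in lexicon}
    seen = set()
    for text in texts:
        normalized = unicodedata.normalize("NFC", text)
        seen.update(DEVANAGARI_WORD.findall(normalized))
    return sorted(seen - known)


if __name__ == "__main__":
    verses = ["राम वन गए ।", "सीता वन गई ॥"]
    known_words = {"राम", "वन"}
    for word in missing_words(verses, known_words):
        print(word)
```

A plain set difference like this only flags surface forms. Inflected, misspelled, or dialect variants would still need manual review before being proposed for inclusion.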

Categories

Linguistics, Literature, Computer Science, Artificial Intelligence, Computational Linguistics, Ontology, Natural Language Processing, Text Extraction, Hindi Language, Text Processing, Word List, Language

Licence