Turkish Sentence Dataset for Word Partitioning
Published: 4 September 2024 | Version 1 | DOI: 10.17632/wztzshk325.1
Contributor:
Hayri Volkan Agun
Description
The dataset contains sentences selected to cover the words that appear in the analogy, NER, POS, and sentiment analysis datasets. Each of these datasets is also included in the folder. The file main.txt holds the selected sentences that represent the words in the datasets. The data file is named train.txt for the extrinsic tasks and sentence-tr.json for the analogy task. The analogy file contains all the word analogies in JSON format.
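As a rough illustration of how these files might be read, the Python sketch below loads a sentence file and a JSON file. It assumes main.txt stores one sentence per line in UTF-8 and that tokens are separated by whitespace; neither is stated in the description above, and the internal structure of sentence-tr.json is not documented here, so the JSON is returned as parsed without interpreting it. The helper names read_sentences, read_json, and vocabulary are hypothetical and are not part of the dataset.

```python
import json
from pathlib import Path


def read_sentences(path):
    """Return the non-empty lines of a UTF-8 text file, stripped of whitespace.

    Assumption: one sentence per line (not confirmed by the dataset description).
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_json(path):
    """Parse a UTF-8 JSON file and return whatever top-level object it holds."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def vocabulary(sentences):
    """Collect the distinct whitespace-separated tokens across all sentences.

    Assumption: whitespace tokenization is adequate for a first look at the data.
    """
    return {token for sentence in sentences for token in sentence.split()}
```

With the archive unpacked, calling read_sentences on main.txt and read_json on sentence-tr.json should give a first view of the contents; the actual paths depend on the folder layout of the download.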
Institutions
Bursa Teknik Universitesi
Categories
Word Processing, Natural Language Processing, Turkish Language, Word Embedding