Turkish Sentence Dataset for Word Partitioning

Published: 4 September 2024 | Version 1 | DOI: 10.17632/wztzshk325.1
Contributor:
Hayri Volkan Agun

Description

The dataset contains sentences selected to cover the words that appear in four Turkish task datasets: analogy, NER, POS, and sentiment analysis. Each of these task datasets is also included in the folder. The file main.txt holds the selected sentences that represent the words in the task datasets. The task data itself is named train.txt for the extrinsic tasks (NER, POS, and sentiment analysis) and sentence-tr.json for the analogy task. The analogy file contains all the word analogies in JSON format.
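
The file layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset: it assumes main.txt stores one sentence per line in UTF-8 and that sentence-tr.json sits in a folder named analogy, neither of which is stated in the description. The helper names (load_sentences, load_analogies, vocabulary) are hypothetical, and the JSON structure is left uninterpreted because its schema is not documented here. The train.txt files are not parsed, since their annotation format is not specified.

```python
import json
from pathlib import Path


def load_sentences(path):
    """Return the non-empty lines of a text file, stripped of whitespace.

    Assumption: one sentence per line, UTF-8 encoded.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def load_analogies(path):
    """Parse the analogy file and return the decoded JSON object as-is.

    The schema is not documented, so no particular structure is assumed.
    """
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def vocabulary(sentences):
    """Collect the distinct whitespace-separated tokens in the sentences."""
    return {token for sentence in sentences for token in sentence.split()}


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the archive was unpacked.
    main_file = Path("main.txt")
    analogy_file = Path("analogy") / "sentence-tr.json"

    if main_file.exists():
        sentences = load_sentences(main_file)
        print(len(sentences), "sentences,", len(vocabulary(sentences)), "distinct tokens")

    if analogy_file.exists():
        analogies = load_analogies(analogy_file)
        print("top-level JSON type:", type(analogies).__name__)
```

Whitespace tokenisation is used only to keep the sketch dependency-free; a word partitioning experiment would normally replace vocabulary with its own segmenter.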

Files

Institutions

Bursa Teknik Universitesi

Categories

Word Processing, Natural Language Processing, Turkish Language, Word Embedding

Licence