BAAD16: Bangla Authorship Attribution Dataset

Published: 14 July 2020 | Version 4 | DOI: 10.17632/6d9jrkgtvv.4
Contributors:
Aisha Khatun, Anisur Rahman, Md. Saiful Islam

Description

A dataset of sample Bangla texts from 16 authors, containing more than 13.4 million words in total. The texts were partitioned into documents of equal length, 750 words each.
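The fixed-length partitioning described above can be illustrated with a minimal sketch. It is not the contributors' actual preprocessing code: the function name `split_into_documents`, the whitespace tokenization, and the choice to drop a trailing remainder shorter than 750 words are all assumptions made for illustration.

```python
from typing import List


def split_into_documents(text: str, words_per_doc: int = 750) -> List[str]:
    """Split text into consecutive documents of exactly words_per_doc words.

    Assumptions (not confirmed by the dataset description):
    - words are separated by whitespace;
    - a trailing remainder shorter than words_per_doc is discarded.
    """
    words = text.split()  # whitespace tokenization (assumed)
    full_docs = len(words) // words_per_doc  # number of complete documents
    return [
        " ".join(words[i * words_per_doc:(i + 1) * words_per_doc])
        for i in range(full_docs)
    ]


if __name__ == "__main__":
    # Hypothetical input: 1,600 placeholder tokens standing in for an author's text.
    sample = " ".join(f"w{i}" for i in range(1600))
    docs = split_into_documents(sample)
    print(len(docs), [len(d.split()) for d in docs])
```

Under these assumptions every resulting document has the same length, so no author's samples differ in size; only the number of documents per author varies.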

Institutions

  • Shahjalal University of Science and Technology

Categories

Natural Language Processing, Bengali Language, Attribution

Licence