Textbook Dataset from NCTB

Name: Textbook Dataset from NCTB
Creator: Abdullah Khondoker
Published: 2023-09-11T17:24:10.987Z
Keywords: Natural Language Processing, Answer Extraction, Bengali Language, Reading Comprehension, Textbook, Text Comprehension, Text Processing

Khondoker, Abdullah; Ahmed, Enam; Tashik, Md. Iftekhar Islam; Mahmud, S M Ishtiak; Parsa, Antara Firuz

doi:10.17632/gktc5y2sy2.1

Published: 11 September 2023 | Version 1 | DOI: 10.17632/gktc5y2sy2.1

Contributors:

Abdullah Khondoker, Enam Ahmed, Md. Iftekhar Islam Tashik, S M Ishtiak Mahmud, Antara Firuz Parsa

Description

We created this dataset to support the development of a Bangla question-answering system. It contains approximately 3,000 question-and-answer pairs that human annotators selected, drawing on NCTB textbooks for classes six to ten. Each passage averages 387 words, which gives enough context for meaningful question answering. The annotators also collected answers for several question types. The dataset is organized into training and validation subsets stored as CSV files, which pair multiple passages with their corresponding questions and annotated answers. We intend it as a resource for researchers and developers working on Bangla language processing.
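As a rough illustration of how the CSV files might be consumed, the sketch below groups question-and-answer pairs by passage and computes the mean passage length in words. The column names (`passage`, `question`, `answer`) are assumptions, not taken from the dataset; check the header row of the actual files before using it.

```python
import csv
from collections import defaultdict
from typing import Dict, IO, List, Tuple

# Hypothetical column names; the real CSV headers may differ.
PASSAGE_COL = "passage"
QUESTION_COL = "question"
ANSWER_COL = "answer"


def group_qa_by_passage(csv_file: IO[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each passage to the (question, answer) pairs asked about it."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row[PASSAGE_COL]].append((row[QUESTION_COL], row[ANSWER_COL]))
    return dict(grouped)


def average_passage_words(grouped: Dict[str, List[Tuple[str, str]]]) -> float:
    """Mean whitespace-delimited word count over the distinct passages."""
    if not grouped:
        return 0.0
    return sum(len(passage.split()) for passage in grouped) / len(grouped)
```

To use it, open one of the subset files with `open(path, encoding="utf-8", newline="")` and pass the file object to `group_qa_by_passage`. UTF-8 is assumed here because the text is Bangla. Whitespace splitting is only an approximation of word count, so the result need not match the reported 387-word average exactly.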

Institutions

BRAC University Department of Computer Science and Engineering
