Bangladesh Sangbidhan Question Answering Dataset for Natural Language Processing (NLP)
Description
The Bangladesh Sangbidhan Question Answering Dataset supports the development and evaluation of Natural Language Processing (NLP) models for question answering. It contains questions based on the Constitution of Bangladesh, each paired with an answer extracted from the text of the Constitution. The dataset is intended to support work on NLP applications for legal and governmental text.
Steps to reproduce
1. Collect Text: Get the full text of the Constitution of Bangladesh.
2. Generate Questions: Create questions based on the Constitution.
3. Extract Answers: Manually or automatically extract answers.
4. Format Dataset: Organize questions, answers, and context in CSV/JSON format.
5. Validate Data: Ensure accuracy and consistency.
6. Annotate (Optional): Categorize and label questions.
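The formatting and validation steps above could be sketched roughly as follows. This is a minimal illustration and not the authors' actual pipeline: the column names `question`, `answer`, and `context`, the helper names `validate`, `to_json`, and `to_csv`, and the sample record are all assumptions made for the example, since the dataset's real schema is not stated here.

```python
import csv
import io
import json

# Assumed column names; the dataset's actual schema may differ.
FIELDS = ["question", "answer", "context"]


def validate(records):
    """Return (index, problem) pairs for records that fail basic checks."""
    problems = []
    for i, rec in enumerate(records):
        # Every field must be present and non-blank.
        for field in FIELDS:
            if not str(rec.get(field, "")).strip():
                problems.append((i, f"missing {field}"))
        # For extractive QA, the answer should appear verbatim in the context.
        answer = rec.get("answer", "")
        context = rec.get("context", "")
        if answer and context and answer not in context:
            problems.append((i, "answer not found in context"))
    return problems


def to_json(records):
    """Serialize records as JSON, keeping non-ASCII (e.g. Bengali) text readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_csv(records):
    """Serialize records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow({f: rec.get(f, "") for f in FIELDS})
    return buf.getvalue()


# Placeholder record for illustration only; it is not taken from the dataset.
sample = [
    {
        "question": "Placeholder question?",
        "answer": "placeholder answer",
        "context": "A passage that contains the placeholder answer verbatim.",
    }
]

print(validate(sample))
print(to_csv(sample))
```

Checking that each answer occurs verbatim in its context is only one possible consistency rule. It suits extractive answers; a dataset with paraphrased answers would need a different check.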
Institutions
- Daffodil International University