MirathQA: An Arabic Dataset for Islamic Inheritance Reasoning

Published: 29 January 2026 | Version 4 | DOI: 10.17632/7jhycpbdpw.4
Contributors:
Ameera Almasoud, Reem Alqifari, Noof Alfear

Description

The repository contains the raw dataset (Mirath Dataset_final), which holds the original collected cases used to generate the multiple-choice questions (MCQs), and three processed splits (train, val, and test). The scripts folder contains all the scripts needed to reproduce the dataset.

Files

Steps to reproduce

The repository is organized into three main folders. The first folder, mirath_dataset, contains the original curated inheritance cases used to generate the multiple-choice questions, along with the derived question–answer sheets, in both .xlsx and .csv formats. The second folder, mirath_splits_70_15_15_stratified, contains three subfolders (train, val, and test), each holding its subset of the multiple-choice question sheets (A–F) according to the predefined 70/15/15 stratified split, again in both .xlsx and .csv formats. The third folder, Scripts, contains the Python scripts that implement data integrity verification, automated Mirath MCQ generation, and stratified data splitting and validation.
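
The stratified splitting and validation step can be illustrated with a minimal Python sketch. It is not the code shipped in the Scripts folder: the function names, the column names ("id", "sheet"), the seed, and the choice of the sheet letter as the stratification key are all assumptions made for illustration, and the repository's scripts may stratify on a different field.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, percents=(70, 15, 15), seed=42):
    """Split rows into train/val/test, applying the ratio inside each stratum.

    `key` names the column used as the stratum label (hypothetical here).
    Integer percentages avoid floating-point rounding surprises.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        n_train = len(members) * percents[0] // 100
        n_val = len(members) * percents[1] // 100
        train.extend(members[:n_train])
        val.extend(members[n_train:n_train + n_val])
        test.extend(members[n_train + n_val:])  # remainder goes to test
    return train, val, test


def validate_split(rows, train, val, test, id_key="id"):
    """Check that the three subsets are disjoint and together cover every row."""
    ids = [r[id_key] for part in (train, val, test) for r in part]
    no_overlap = len(ids) == len(set(ids))
    full_cover = set(ids) == {r[id_key] for r in rows}
    return no_overlap and full_cover


# Toy data: 6 hypothetical sheets (A-F) with 20 questions each.
rows = [{"id": f"{s}{i}", "sheet": s} for s in "ABCDEF" for i in range(20)]
train, val, test = stratified_split(rows, key="sheet")
print(len(train), len(val), len(test))
print(validate_split(rows, train, val, test))
```

Applying the ratio within each stratum, rather than to the pooled data, keeps every stratum represented in all three subsets in roughly the same proportion, which is the point of a stratified split.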

Institutions

Categories

Natural Language Processing, Benchmarking, Inheritance, Reasoning, Large Language Model

Licence