Istidlal: A Dataset for Benchmarking Logical and Sequential Economic Reasoning in Arabic LLMs
Description
This repository contains the data and resources associated with Istidlal, the first Arabic benchmark for evaluating logical and sequential reasoning in Arabic Large Language Models (LLMs) within the economic and financial domain. The repository includes two main assets:

1. Arabic Economic and Financial News Corpus (AEFNC): A curated corpus of 5,450 Arabic news articles collected from four reputable financial sources: Al-Eqtisadiah, Asharq Business, CMA Financial News, and Alsayrfah. The articles were published between 2010 and 2025 and cover economic and financial topics, including Islamic finance and Sharia-compliant governance.

2. Istidlal Benchmark: A set of 1,000 validated reasoning scenarios derived from the AEFNC, provided under two complementary task setups:
   (i) MCQ (Multiple-Choice Questions): 500 items targeting recognition-based sequential reasoning, where the model selects the correct event ordering from four candidate options.
   (ii) OOQ (Open-Order Questions): 500 items targeting generative sequential reasoning, where the model constructs the correct event sequence from scratch.
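To make the difference between the two task setups concrete, the following is a minimal scoring sketch. It is illustrative only: the description does not state which metrics the benchmark uses, so plain accuracy for MCQ and exact-match ordering for OOQ are assumptions, as are the option labels (`A` to `D`), the integer event indices, and the function names `mcq_accuracy` and `ooq_exact_match`.

```python
from typing import Sequence


def mcq_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of MCQ items whose chosen option label matches the gold label.

    Labels such as "A".."D" are a hypothetical encoding of the four candidates.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    # Compare labels case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predicted, gold)
    )
    return correct / len(gold)


def ooq_exact_match(
    predicted: Sequence[Sequence[int]], gold: Sequence[Sequence[int]]
) -> float:
    """Fraction of OOQ items whose generated event order equals the gold order.

    Each sequence is a list of hypothetical event indices; exact match is an
    assumed metric, not necessarily the one used by the benchmark authors.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(list(p) == list(g) for p, g in zip(predicted, gold))
    return correct / len(gold)


if __name__ == "__main__":
    # Toy predictions, not taken from the dataset.
    print(mcq_accuracy(["A", "b", "C", "D"], ["A", "B", "D", "D"]))
    print(ooq_exact_match([[1, 2, 3], [2, 1, 3]], [[1, 2, 3], [1, 2, 3]]))
```

Exact match is the strictest choice for OOQ; a partial-credit measure such as a rank correlation between predicted and gold orderings would be a reasonable alternative if near-correct sequences should be rewarded.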
Files
Steps to reproduce
The repository includes a `Scripts/` folder containing all code needed to reproduce this work, including the prompts used for article classification, question generation, and validation.