Dataset Sirah Nabawiyah - Part 2

Published: 23 April 2026 | Version 1 | DOI: 10.17632/zvn9cjn8nh.1
Contributors:

Description

The Sirah Nabawiyah Dataset – Part 2 is a structured collection of textual data covering the later phases of the life of Prophet Muhammad (peace be upon him). It is intended to support research and development in natural language processing (NLP), text mining, historical analysis, and educational applications.

The dataset contains narratives, events, and documented accounts of significant moments after the early period of Islam. These may include major battles (such as Uhud and Khandaq), treaties (e.g., Hudaybiyyah), the conquest of Mecca, the spread of Islam, and the Prophet’s final years. Entries follow a systematic format that may include identifiers, titles, chronological markers, and detailed descriptions of events.

From a technical perspective, the dataset can be used for text classification, clustering, topic modeling, summarization, and sequence-based learning. Because it carries contextual information and a temporal progression of events, it suits both supervised and unsupervised machine learning approaches. Overall, it is a resource for exploring historical narratives through computational methods while preserving the chronological and thematic structure of the Sirah Nabawiyah.

Files

Steps to reproduce

1. Environment Setup
   Specify exactly:
   - Programming language (e.g., Python 3.x)
   - Libraries (NumPy, Pandas, Scikit-learn, TensorFlow, etc.)
   - Hardware (optional but recommended: CPU/GPU)
   Example: install the required libraries: pandas, numpy, scikit-learn

2. Dataset Preparation
   Explain:
   - Dataset used: Sirah Nabawiyah Dataset – Part 2
   - Format (CSV, JSON, etc.)
   - Preprocessing applied: text cleaning, tokenization, stopword removal

3. Data Processing / Feature Engineering
   State this step explicitly, since it is often left vague:
   - Convert the text to a numerical representation: TF-IDF, Bag of Words, or embeddings
   - Normalize or scale the data if needed

4. Model / Method Implementation
   Be explicit:
   - If clustering: the algorithm used (e.g., K-Means) and the number of clusters (k = ?)
   - If classification: the model used (e.g., SVM, LSTM)

5. Training / Execution
   Explain how the method is run:
   - Split the data (if applicable)
   - Train the model
   - Run clustering / prediction

6. Evaluation
   Report the metrics used:
   - Clustering: Silhouette Score
   - Classification: Accuracy, F1-score

7. Result Reproduction
   Explain how to obtain the final output:
   - Run the main script
   - Output: cluster labels / predictions and evaluation scores
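
As an illustration of how the clustering path through these steps might look, here is a minimal Python sketch using scikit-learn. It is not the contributors' pipeline. The record does not state the file format or column names, so the sketch runs on a handful of placeholder sentences written for this example (they are not entries from the dataset). The helper names `clean` and `cluster_texts`, the choice of k = 2, and the random seed are likewise assumptions.

```python
import re

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Placeholder texts standing in for dataset entries.
# ASSUMPTION: these are illustrative only and NOT taken from the dataset.
texts = [
    "Account of the battle of Uhud",
    "Account of the battle of Khandaq",
    "Narrative of the battle and its aftermath",
    "Terms of the treaty of Hudaybiyyah",
    "Negotiation of the treaty of Hudaybiyyah",
    "Signing of the treaty and its terms",
]


def clean(text):
    """Step 2: lowercase, keep only ASCII letters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def cluster_texts(docs, k=2, seed=42):
    """Steps 3 to 6: TF-IDF features, K-Means clustering, silhouette score.

    k and seed are hypothetical defaults chosen for this example.
    """
    cleaned = [clean(d) for d in docs]

    # Step 3: TF-IDF with built-in English stopword removal and tokenization.
    vectorizer = TfidfVectorizer(stop_words="english")
    features = vectorizer.fit_transform(cleaned)

    # Steps 4 and 5: fit K-Means and assign each document to a cluster.
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = model.fit_predict(features)

    # Step 6: silhouette score lies in [-1, 1]; higher means better separation.
    score = silhouette_score(features, labels)
    return labels, score


# Step 7: produce the final output (cluster labels and evaluation score).
labels, score = cluster_texts(texts)
print("cluster labels:", list(labels))
print("silhouette score:", round(float(score), 3))
```

To apply this to the actual files, the `texts` list would be replaced by the text column loaded from the dataset (for example with pandas, once the format is known), and k would be chosen by comparing silhouette scores across several values. A classification variant would follow the same shape, with a labelled train/test split and accuracy and F1-score in place of the silhouette score.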

Categories

Natural Language Processing

Licence