Historical Arabic Handwritten Text Recognition Dataset
Published: 23 January 2024 | Version 1 | DOI: 10.17632/xz6f8bw3w8.1
Contributors:
Description
This dataset contains a rich collection of historical Arabic handwritten text spanning different geographies and centuries. Experts have meticulously transcribed forty historical pages, five from each of eight distinct books, providing the textual ground truth for each page image. To our knowledge, no comparable data has previously been made publicly available. The dataset is intended to support the training and testing of deep learning OCR models by practitioners and researchers interested in Arabic OCR and OCR correction.
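Because each page image is paired with an expert transcription, one common way to test an OCR model against such ground truth is the character error rate (CER): the edit distance between the model output and the reference, divided by the reference length. The following is a minimal sketch of that metric, not the authors' evaluation procedure. The function names are hypothetical, and the strings are passed in directly because the file layout of the dataset is not described here.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete ref_char
                curr[j - 1] + 1,     # insert hyp_char
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from the dataset.
    ground_truth = "كتاب"  # four characters
    ocr_output = "كتب"     # one character missing
    print(character_error_rate(ground_truth, ocr_output))
```

Note that Python strings are compared code point by code point, so Arabic diacritics, which are separate combining characters, each count as one character in this sketch. Whether to strip or keep them before scoring is a choice left to the evaluator.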
Files
Institutions
Islamic University in Madinah
Categories
Artificial Intelligence, Computer Vision, Document Analysis, Optical Character Recognition, Handwriting Recognition, Natural Language Processing, Arabic Language, Historical Analysis, Deep Learning, Language Modeling, Applied Machine Learning
Funding
This research was funded by the Deputyship of Research and Innovation, Ministry of Education, Saudi Arabia (Project #964). In addition, the authors would like to express their appreciation for the support provided by the Islamic University of Madinah.