AHD: Arabic Healthcare Dataset

Published: 11 July 2024 | Version 5 | DOI: 10.17632/mgj29ndgrk.5
Contributor:
Hezam Gawbah

Description

- Numerous language-centric studies on healthcare are conducted every day. To address the shortcomings of Arabic natural language generation models, we introduce the Arabic Healthcare Dataset (AHD), a large collection of textual data.
- To our knowledge, AHD is the largest Arabic healthcare dataset. It was collected from the Altibbi website.
- AHD consists of more than 808k question-and-answer pairs across 90 categories. The files are described below; the actual data is in Arabic.
- AHD.xlsx contains the dataset in Excel format, with the question, answer, and category in Arabic.
- AHD_english.xlsx contains the dataset in Excel format, with the question, answer, and category translated into English.
- Distribution of Question and Answer per category.xlsx shows the distribution of the dataset by category.
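As a rough illustration of how the per-category distribution could be recomputed from the data file, the sketch below counts question-and-answer pairs per category. The column names (`Question`, `Answer`, `Category`) and the helper names are assumptions made for this example, not confirmed by the dataset; check the header row of AHD.xlsx before using them. Loading the spreadsheet relies on pandas with an Excel engine such as openpyxl, while the counting itself uses only the standard library.

```python
from collections import Counter

# Assumed column names; the real header row of AHD.xlsx may differ.
QUESTION_COL, ANSWER_COL, CATEGORY_COL = "Question", "Answer", "Category"


def category_distribution(rows):
    """Count rows per category, most frequent first.

    Rows whose category is missing or not a non-empty string are skipped.
    """
    counts = Counter()
    for row in rows:
        category = row.get(CATEGORY_COL)
        if isinstance(category, str) and category.strip():
            counts[category.strip()] += 1
    return counts.most_common()


def load_rows(path="AHD.xlsx"):
    """Read the spreadsheet into a list of dicts (requires pandas and openpyxl)."""
    import pandas as pd  # imported lazily so the counting helper has no dependencies

    return pd.read_excel(path).to_dict("records")


# Small in-memory sample with placeholder values, standing in for the real file.
sample = [
    {QUESTION_COL: "q1", ANSWER_COL: "a1", CATEGORY_COL: "cat_a"},
    {QUESTION_COL: "q2", ANSWER_COL: "a2", CATEGORY_COL: "cat_b"},
    {QUESTION_COL: "q3", ANSWER_COL: "a3", CATEGORY_COL: "cat_a"},
]
distribution = category_distribution(sample)
print(distribution)
```

With the real file downloaded, `category_distribution(load_rows("AHD.xlsx"))` would give counts that can be compared against the distribution spreadsheet.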

Files

Institutions

Ibb University

Categories

Medical Assistant, Natural Language Processing, Arabic Language, Healthcare Research, Natural Language Generation, Text Processing, Deep Learning, Natural-Language Understanding, Chatbot

Licence