AHD: Arabic Healthcare Dataset
Description
- Language-centric research on healthcare is growing steadily. To address the shortcomings of Arabic natural language generation models, we introduce a large Arabic Healthcare Dataset of textual data, hence the name ‘AHD’.
- To our knowledge, AHD is the largest Arabic healthcare dataset. It was collected from the Altibbi website.
- AHD consists of more than 808k question-and-answer pairs across 90 categories. The original data is in Arabic. The files are described below.
- AHD.xlsx contains the dataset in Excel format, with the question, answer, and category in Arabic.
- AHD_english.xlsx contains the dataset in Excel format, with the question, answer, and category translated to English.
- Distribution of Question and Answer per category.xlsx shows the distribution of the dataset by category.
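The per-category distribution described above can be recomputed from the question/answer/category rows. The following is a minimal sketch rather than the authors' method: the helper name `category_distribution`, the `category` key, and the sample rows are assumptions made for illustration, and the actual column headers in AHD.xlsx may differ. In practice the rows would first be read from the spreadsheet, for example with `pandas.read_excel`.

```python
from collections import Counter


def category_distribution(rows, category_key="category"):
    """Count question-answer pairs per category, most frequent first.

    `rows` is an iterable of dicts, one per question-answer pair.
    `category_key` is an assumed column name, not confirmed by the dataset.
    """
    counts = Counter(row[category_key] for row in rows)
    return counts.most_common()


# Hypothetical placeholder rows, not taken from AHD.
sample_rows = [
    {"question": "q1", "answer": "a1", "category": "category_a"},
    {"question": "q2", "answer": "a2", "category": "category_b"},
    {"question": "q3", "answer": "a3", "category": "category_a"},
]

distribution = category_distribution(sample_rows)
for category, count in distribution:
    print(f"{category}: {count}")
```

Sorting by frequency makes it easy to spot under-represented categories before splitting the data for training and evaluation.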