A Structured Bangla Dataset of Disease-Symptom Associations to Improve Diagnostic Accuracy
Description
The dataset is in tabular format and records disease-symptom relationships. It is organized as follows: the first column lists the diseases, and each remaining column represents one symptom. Each cell contains a binary value (1 or 0), where 1 indicates that the symptom is associated with the disease and 0 indicates no association. The dataset covers 85 unique diseases and 172 symptoms, with 758 disease-symptom relations in total. If you use the dataset, please cite the following: R. Zannat, A. Al Shafi and A. Muntakim, "Bridging the Gap in Bangla Healthcare: Machine Learning Based Disease Prediction Using a Symptoms-Disease Dataset," 2025 International Conference on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 2025, pp. 1-6, doi: 10.1109/ECCE64574.2025.11012950. URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11012950&isnumber=11012919
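
The layout described above (disease names in the first column, one binary column per symptom) can be read with a few lines of Python. The following is a minimal sketch, not the authors' code: the sample rows, the English column names, and the function name `load_associations` are hypothetical placeholders, and the real file presumably uses Bangla labels and would need to be opened with a suitable encoding such as UTF-8.

```python
import csv
import io

# Hypothetical sample in the described layout. The real dataset's labels,
# header names, and contents are not reproduced here.
SAMPLE_CSV = """disease,fever,cough,headache,nausea
Influenza,1,1,1,0
Migraine,0,0,1,1
Common cold,1,1,0,0
"""


def load_associations(handle):
    """Read a binary disease-symptom matrix.

    Returns (table, symptoms), where table maps each disease to the set of
    symptoms marked 1 in its row, and symptoms is the list of symptom columns.
    """
    reader = csv.reader(handle)
    header = next(reader)
    symptoms = header[1:]  # first column holds the disease name
    table = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        disease, flags = row[0], row[1:]
        table[disease] = {
            symptom
            for symptom, flag in zip(symptoms, flags)
            if flag.strip() == "1"
        }
    return table, symptoms


table, symptoms = load_associations(io.StringIO(SAMPLE_CSV))

# A "relation" is one cell with value 1, so the total is the sum of set sizes.
relation_count = sum(len(associated) for associated in table.values())

print(len(table), len(symptoms), relation_count)  # prints: 3 4 7
```

Run against the actual file, the same three counts would be expected to match the figures stated above (85 diseases, 172 symptoms, 758 relations), which makes this a quick sanity check that the file was parsed correctly.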