A Binary Classification Dataset for Allergy Status Prediction among University Students
Description
This dataset contains survey-based information on allergy-related factors among university students of Daffodil International University, Bangladesh. The data were collected through a structured questionnaire between September 2025 and March 2026. The main purpose of the dataset is to support the analysis of allergy-associated demographic, environmental, medical history, and lifestyle factors among university students and to develop supervised machine learning classification models for predicting allergy status.

File Information
- File name: University Student Allergy Dataset.csv
- File type: CSV
- Number of records: 854
- Number of variables: 17
- Target variable: target_class
- Task type: Binary classification
- Data source: Structured survey of students from Daffodil International University, Bangladesh
- Data collection period: September 2025 to March 2026

The dataset contains the following variables:
- gender: Gender of the student; values include Male and Female.
- age: Age of the student in years.
- weight_kg: Body weight of the student measured in kilograms.
- height_ft: Height of the student measured in feet.
- pet_home: Indicates whether the student has a pet at home.
- pet_type: Type of pet owned, such as Cat, Dog, Bird, Fish, Cow, Rabbit, Hen, or No pet.
- residence: Type of residential environment, including Urban, Suburban, Rural village, and Industrial area.
- mold_env: Indicates exposure to mold in the living environment.
- doctor_cond: Doctor-diagnosed allergy-related condition, including Allergic Rhinitis, Food Allergy, Drug Allergy, Asthma, Atopic Dermatitis, or No Allergy.
- family_allergy: Family history of allergy-related conditions.
- rhinitis_diag: Indicates whether the student has been diagnosed with rhinitis.
- rhinitis_duration(years): Duration of rhinitis in years.
- trigger_symptom: Indicates whether the student experiences allergy symptoms due to specific triggers.
- smoke_now: Current smoking status of the student.
- cigs_per_day: Number of cigarettes smoked per day. This variable may contain outliers or noisy entries and may require preprocessing before analysis.
- secondhand_smoke: Indicates exposure to secondhand smoke.
- target_class: Allergy status of the student; Positive indicates allergy presence and Negative indicates no allergy.

Methodology / Data Collection Method
Data were collected using a structured survey questionnaire administered to students of Daffodil International University, Bangladesh. The questionnaire was designed to capture multiple factors potentially associated with allergy status, including demographic characteristics, environmental exposure, medical history, family allergy background, rhinitis-related information, smoking behavior, and secondhand smoke exposure. Each response was recorded as an individual observation in the dataset. The collected data were organized into categorical and numerical attributes and prepared in CSV format for statistical analysis and machine learning applications.
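The description notes that cigs_per_day may contain outliers or noisy entries. As a minimal sketch of one possible preprocessing step (not the dataset authors' method), the Python below checks that a CSV header contains the 17 documented column names and then flags suspicious cigs_per_day entries. The function names, the IQR-based upper fence, and the choice to compute that fence from positive values only are all assumptions made for illustration; the actual encoding of blanks and non-smokers in the file is not documented here and should be checked first.

```python
import csv
import io
import math
import statistics

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "gender", "age", "weight_kg", "height_ft", "pet_home", "pet_type",
    "residence", "mold_env", "doctor_cond", "family_allergy",
    "rhinitis_diag", "rhinitis_duration(years)", "trigger_symptom",
    "smoke_now", "cigs_per_day", "secondhand_smoke", "target_class",
]


def load_rows(handle):
    """Read CSV rows as dicts and verify the documented columns are present."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def to_float(value):
    """Return a finite float, or None for blank or non-numeric text."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None


def flag_cigs_per_day(rows, k=1.5):
    """Return indices of rows whose cigs_per_day looks noisy.

    Assumption: an entry is suspicious if it cannot be parsed, is negative,
    or exceeds Q3 + k * IQR computed over the positive values only
    (so that a large share of zeros does not collapse the IQR).
    """
    parsed = [to_float(row["cigs_per_day"]) for row in rows]
    positive = [v for v in parsed if v is not None and v > 0]
    if len(positive) >= 2:
        q1, _, q3 = statistics.quantiles(positive, n=4)
        upper = q3 + k * (q3 - q1)
    else:
        upper = float("inf")  # too few smokers to estimate a fence
    return [
        i for i, v in enumerate(parsed)
        if v is None or v < 0 or v > upper
    ]
```

With the real file, the rows could be loaded by passing an open file handle to load_rows, for example one opened with newline="" as the csv module recommends; the file encoding is an assumption to verify. Flagged rows are only candidates for review: whether to drop, cap, or correct them depends on the analysis.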
Files
Institutions
- Daffodil International University, Dhaka Division, Dhaka