A harmonized longitudinal dataset of engineering students for learning analytics and predictive modeling
Description
This dataset contains 1,656 student records with 56 variables, synthesized from academic administrative data at an application-oriented university. The data were compiled from multiple spreadsheet-based sources across several academic years and underwent cleaning, standardization, and harmonization to ensure consistency in naming conventions and coding structures. The dataset preserves real-world curriculum variations, including changes in course naming, sequencing, and elective pathways. It is provided in two formats: a labeled version for interpretation and a harmonized coded version for analysis. Variables include socio-demographic information, course grades, cumulative academic performance indicators, scholarship records, and academic status. The dataset can be used for learning analytics, educational data mining, machine learning applications (e.g., student performance prediction), and research on data integration and synthetic data validation.
Institutions
- Hanoi University of Industry, Hanoi