Datasets of Virtual Classrooms for Predicting Students' Performance (University of Extremadura, Spain)

Published: 31-07-2020 | Version 2 | DOI: 10.17632/73x6vdd8bh.2
Juan A. Gomez-Pulido


We built eight datasets for experiments on predicting students' performance. The datasets were prepared from the virtual classrooms of a series of university-level courses, and the data were filtered to remove atypical cases. The suite is intended to cover different academic contexts: number of students and tasks, degree level (first, middle or final years), academic nature (theoretical or practical contents), etc. Its heterogeneity also makes it possible to consider different numbers of latent factors. The virtual classrooms of the online campus of the University of Extremadura (UEX) provided the data for the datasets. Each dataset is characterized by its number of students (S), number of academic/assessed tasks (I), scores/performance matrix (P), and its known (D^{knw}), unknown (D^{unk}), training (D^{train}) and test (D^{test}) performances. The datasets are provided as MATLAB .mat files and are also described in Microsoft Excel sheets.
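To illustrate how the listed quantities may relate to each other, the following minimal Python sketch builds a toy S x I performance matrix and derives the four sets from it. It rests on assumptions not stated in the description: that D^{knw} holds the recorded (student, task, score) entries of P, that D^{unk} holds the positions with no recorded score, and that D^{train} and D^{test} partition D^{knw}. The function name `split_performances`, the `test_fraction` parameter and the scores are hypothetical, and the sketch does not read the actual .mat files.

```python
import random


def split_performances(P, test_fraction=0.2, seed=0):
    """Derive known/unknown/train/test sets from a scores matrix.

    P is a list of S rows (students), each with I entries (tasks).
    None marks a score that was not recorded (assumed convention).
    """
    # Known performances: every (student, task, score) with a recorded score.
    known = [(s, i, score)
             for s, row in enumerate(P)
             for i, score in enumerate(row)
             if score is not None]
    # Unknown performances: positions of the matrix with no recorded score.
    unknown = [(s, i)
               for s, row in enumerate(P)
               for i, score in enumerate(row)
               if score is None]
    # Shuffle a copy of the known entries and cut it into test and train.
    rng = random.Random(seed)
    shuffled = known[:]
    rng.shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return known, unknown, train, test


# Toy matrix with S = 3 students and I = 4 tasks (made-up scores).
P = [
    [7.5, None, 6.0, 9.0],
    [None, 5.0, 8.5, None],
    [4.0, 6.5, None, 7.0],
]

known, unknown, train, test = split_performances(P, test_fraction=0.25)
print(len(known), len(unknown), len(train), len(test))  # prints: 8 4 6 2
```

Under these assumptions, a prediction model would be fitted on the training entries, evaluated on the test entries, and then used to estimate the unknown positions of P.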