Study habits and artificial intelligence use among university students: A proportionally stratified survey dataset

Published: 18 March 2026 | Version 1 | DOI: 10.17632/mcwb2ppdsw.1
Contributor: Cristian Correa

Description

This repository contains a survey dataset on study habits and artificial intelligence (AI) use among undergraduate students at the Manizales campus of a Colombian public university. The data were collected during the first academic term of 2025 using a proportional allocation across 14 academic programs. Within each program, data collection followed a quota-based approach: responses were collected until the target number of students for that program was reached. The final dataset includes 357 observations and preserves the intended distribution of the sample across programs.
The repository includes both the original and processed versions of the data. The file “survey_raw_spanish_anonymized.csv” contains the original questionnaire responses in Spanish, with identifying timestamp detail removed. The file “survey_cleaned_english.csv” provides a cleaned analytical version with English variable names, harmonized categorical responses, a cleaned GPA variable, a standardized AI tool field, and a derived count of reported AI tools.
Additional files support interpretation and reuse. The “sampling_frame_by_program.csv” file documents the population and proportional allocation by academic program. The “codebook.csv” provides variable definitions, response formats, and missing-value conventions. The “questionnaire_bilingual.csv” contains the survey instrument in Spanish and English, organized by thematic sections and linked to coded items. The “analysis_script.R” file reproduces the data cleaning, transformation, and descriptive analysis steps.
The dataset includes variables on sociodemographic characteristics, study habits, academic self-perception, AI tool use, perceived usefulness, academic integrity, perceived dependence, creativity, and prompt engineering knowledge. Most variables are ordinal categorical, making the dataset suitable for descriptive and exploratory analysis.
This dataset is intended for secondary use in educational research, including studies of study habits, AI adoption in higher education, academic integrity, and cross-program comparisons. It is based on self-reported responses from a single campus and follows a cross-sectional design; therefore, it should not be used to draw causal conclusions.
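The proportional allocation described above can be illustrated with a minimal Python sketch. It is not taken from “analysis_script.R”. The program names and population counts are hypothetical placeholders (the real figures are in “sampling_frame_by_program.csv”), and the largest-remainder rounding rule is an assumption, since the description does not state how fractional targets were rounded.

```python
from math import floor


def proportional_allocation(population, sample_size):
    """Split sample_size across strata in proportion to their population.

    Fractional targets are rounded with the largest-remainder rule so the
    per-stratum targets always sum to sample_size (an assumed rounding rule).
    """
    total = sum(population.values())
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")

    # Exact (fractional) share of the sample for each stratum.
    exact = {name: sample_size * count / total for name, count in population.items()}

    # Start from the integer part of each share.
    targets = {name: floor(share) for name, share in exact.items()}

    # Hand the leftover units to the strata with the largest fractional parts.
    shortfall = sample_size - sum(targets.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - targets[name], reverse=True)
    for name in by_remainder[:shortfall]:
        targets[name] += 1
    return targets


# Hypothetical program sizes, for illustration only.
population = {"Program A": 1200, "Program B": 800, "Program C": 500}
targets = proportional_allocation(population, sample_size=357)
print(targets)
```

Under a quota-based collection like the one described, each program's target would act as a stopping point: responses are accepted for a program until its count reaches the allocated number.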

Categories

Artificial Intelligence, Education, Higher Education

Licence