Dataset of Computer Science Course Queries from Students: Categorized and Scored According to Bloom's Taxonomy

Published: 5 November 2024 | Version 4 | DOI: 10.17632/w5zt9n6vsc.4
Contributors:
Khandoker Ashik Uz Zaman

Description

This dataset consists of four .csv files:

1. Data_Structure.csv: scored and categorized questions from the "Data Structure" course.
2. Introduction_to_Computers_and_Research.csv: scored and categorized questions from the "Introduction to Computers and Research" course.
3. Irrelevant_Questions.csv: irrelevant questions that do not belong to the courses above but were asked by students from those courses.
4. Blooms_Taxonomy.csv: the Bloom's Taxonomy keywords used to evaluate the questions in this dataset.

The questions in the first three files were asked by students of Independent University, Bangladesh (IUB) in Computer Science courses during the Summer 2023 semester.

This dataset was created to highlight the use of AI in an educational context. It focuses on questions because our goal was to revive students' innate curiosity to learn through querying. We concentrated on computer science (CS) courses because we found higher dropout rates in CS courses than in others.

The questions have been manually pre-processed and categorized by course and topic. They have also been scored using the six question levels of Bloom's taxonomy: remember (10 points), understand (20 points), apply (20 points), analyze (25 points), evaluate (35 points), and create (40 points). Any question scoring above 100 points is considered a high-level question, and the maximum attainable score for a single question is 150 points.
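The scoring scheme described above can be illustrated with a minimal Python sketch. It rests on two assumptions that the description does not state explicitly: that each Bloom's level is counted at most once per question, and that the points of all matched levels are added together (which is consistent with the stated maximum of 150 points). The keyword lists in `LEVEL_KEYWORDS` are hypothetical placeholders, not the contents of Blooms_Taxonomy.csv, and the function names are illustrative only.

```python
import re

# Points per Bloom's level, as given in the dataset description.
LEVEL_POINTS = {
    "remember": 10,
    "understand": 20,
    "apply": 20,
    "analyze": 25,
    "evaluate": 35,
    "create": 40,
}

# Hypothetical keywords for illustration; the real lists are in Blooms_Taxonomy.csv.
LEVEL_KEYWORDS = {
    "remember": {"define", "list", "name"},
    "understand": {"explain", "describe", "summarize"},
    "apply": {"implement", "use", "solve"},
    "analyze": {"compare", "differentiate", "analyze"},
    "evaluate": {"justify", "evaluate", "critique"},
    "create": {"design", "construct", "develop"},
}

# Questions scoring above this value count as high-level questions.
HIGH_LEVEL_THRESHOLD = 100


def score_question(question):
    """Return (score, matched_levels) for one question.

    Assumption: a level contributes its points once if any of its
    keywords appears in the question, and matched levels are summed.
    """
    words = set(re.findall(r"[a-z]+", question.lower()))
    matched = [level for level, keywords in LEVEL_KEYWORDS.items() if words & keywords]
    return sum(LEVEL_POINTS[level] for level in matched), matched


def is_high_level(score):
    """A question is high level only if it scores strictly above 100 points."""
    return score > HIGH_LEVEL_THRESHOLD
```

For example, under these placeholder keywords, `score_question("Define a stack and explain how to implement it.")` matches the remember, understand, and apply levels, so the question scores 10 + 20 + 20 points and is not a high-level question.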

Files

Steps to reproduce

The data was gathered using an online application, which students of Independent University, Bangladesh (IUB) used in the classroom during the Summer 2023 semester. The application was trialed in the "Data Structure" and "Introduction to Computers and Research" courses. The questions were pre-processed by fixing grammar, punctuation, spelling, and capitalization. After pre-processing, similar questions were removed using Levenshtein distance with an 80% similarity threshold. The remaining questions were then scored with a keyword-based detection method built on Bloom's Taxonomy keywords. Finally, the questions were manually categorized according to their topics and courses. The keywords for Bloom's taxonomy were compiled from multiple sources; where the same keyword appeared at different levels of the taxonomy, we assigned it to the level with which it was most commonly associated.
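The near-duplicate removal step can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the description does not say how the Levenshtein distance was turned into a similarity percentage, so the sketch assumes similarity = 1 - distance / length of the longer string, compares questions case-insensitively, and keeps the first question of each group of near duplicates. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def drop_near_duplicates(questions, threshold=0.8):
    """Keep a question only if it is below the threshold against every kept one."""
    kept = []
    for question in questions:
        if all(similarity(question.lower(), k.lower()) < threshold for k in kept):
            kept.append(question)
    return kept
```

With this normalization, "What is a stack?" and "What is a stack" differ by a single edit over 16 characters, so the second is treated as a duplicate of the first and dropped. The pairwise comparison is quadratic in the number of questions, which is acceptable for classroom-sized datasets.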

Institutions

Independent University

Categories

Education, Student Learning, Student Cognition, Student Development, Student Performance

Licence