Dataset for classifying English words into difficulty levels by undergraduate and postgraduate students

Published: 10 May 2023 | Version 1 | DOI: 10.17632/p2wrs7hm4z.1
Nisar Kangoo


The dataset contains 5372 unique English words, each labeled with a difficulty level. A word is labeled "level 2" if it is difficult for postgraduate students, "level 1" if it is difficult for undergraduate students, and "level 0" if it is difficult for neither group. There are 691 words at level 1, 141 words at level 2, and 4541 words at level 0 (easy). The data was collected from students in Jammu and Kashmir, a Union Territory of India (latitude and longitude: 32.2778° N, 75.3412° E). The ALL UG-PG word file contains the paragraphs from IGNOU English textbooks that were used in the questionnaires for collecting the data. The dataset_level CSV file is the original dataset. The dataset_numerical CSV file contains the original dataset with its string fields converted to numerical values.
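As a quick sanity check after downloading, the per-level counts can be tallied from either CSV file. The sketch below is only illustrative: the column names `word` and `level` are assumptions, not taken from the dataset, and the four sample rows are made up so the snippet runs without the real files.

```python
import csv
import io
from collections import Counter

# Meaning of each difficulty level, as stated in the dataset description.
LEVEL_MEANING = {
    0: "easy for both undergraduate and postgraduate students",
    1: "difficult for undergraduate students",
    2: "difficult for postgraduate students",
}


def count_levels(csv_file, level_column="level"):
    """Count how many rows carry each difficulty level.

    `level_column` is a hypothetical column name; check the header
    of the real CSV file and pass the actual name.
    """
    reader = csv.DictReader(csv_file)
    return Counter(int(row[level_column]) for row in reader)


# Tiny made-up sample standing in for the real CSV file.
sample = io.StringIO(
    "word,level\n"
    "cat,0\n"
    "ubiquitous,1\n"
    "obstreperous,2\n"
    "house,0\n"
)

counts = count_levels(sample)
for level in sorted(counts):
    print(level, counts[level], LEVEL_MEANING[level])
```

To run this on the real data, replace `sample` with an opened file, for example `open("dataset_level.csv", newline="", encoding="utf-8")`, assuming the file was saved under that name, and compare the resulting counts with the figures given in the description.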



Lovely Professional University Faculty of Technology and Sciences


Machine Learning, Computer