VOCHITEXT: Vocabulary Corpus of Chilean Children’s Textbooks
Description
VOCHITEXT is a specialized corpus of 23,297 Chilean Spanish words, curated to represent the core vocabulary used in Chilean primary education. It is derived from 58 complete texts (student books) collected in 2022 and covers the official textbooks for science, history, language, social science, and mathematics, from preschool through 4th grade. For preschool, it includes words from reading materials (children's tales) recommended by the Chilean Ministry of Education. The corpus was originally developed for the LEXIKON App project (ANID Fondef, IT21I0078, University of Concepcion) to assess children's lexical development, and it provides a representative sample of school vocabulary. It is intended for researchers, educators, and developers working on Chilean primary education, vocabulary acquisition, or the linguistic analysis of educational materials, and it shows the lexical content that Chilean students encounter across subjects and grade levels.
Funding
Agencia Nacional de Investigación y Desarrollo
IT21I0078