Kazakhstan Teacher Workforce Dynamics Dataset (2015–2024)
Description
This dataset documents teacher workforce dynamics in Kazakhstan from 2015 through 2024. It includes annual measures of teacher exits, turnover, retention, and cohort survival rates. The data cover the entire population of public-school teachers across 4,919 settlements, 212 districts (raions), and 20 administrative regions of Kazakhstan. The dataset was derived from the Kazakhstan National Education Database (NEDB), which provides official administrative records of schools, students, and teachers. It is intended to support research on workforce planning, teacher mobility, education policy, and long-term staffing needs.
- The dataset consists of nine structured tables derived from aggregated teacher-level records for the period 2015–2024. The tables are provided in .csv format in the Dataset/tables folder. Each table presents indicators such as exit counts, retention rates, turnover rates, and cohort-based retention patterns.
- A detailed variable description is available in the accompanying Dataset/tables/Codebook.xlsx, which explains all geographic identifiers, demographic categories, and indicator definitions. Each table is linked to a visualization (bar chart, line graph, or heatmap) that highlights the main workforce dynamics by year, region, or demographic group. These figures are stored in the Dataset/data/visualizations folder and follow the table numbering for easy cross-reference.
- The Dataset/code folder contains the Jupyter notebooks (.ipynb) with the Python code used for data cleaning, aggregation, and indicator calculation, along with a README.txt file documenting the workflow and the formulas used.
The dataset enables users to investigate teacher retention and turnover across time, geographic scales (settlements, districts, regions), and age groups (<30, 30–39, 40–49, 50+). Researchers can replicate the analysis or extend it using the provided notebooks.
All data are anonymized and aggregated to protect individual confidentiality.
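As a first step with the tables described above, it can help to list each .csv file together with its column names and compare them against Codebook.xlsx. The following is a minimal sketch using only the Python standard library. The function name list_table_columns is hypothetical, and UTF-8 encoding of the files is an assumption rather than something the dataset documentation states.

```python
import csv
from pathlib import Path


def list_table_columns(tables_dir):
    """Map each .csv file name in tables_dir to its list of column names."""
    columns = {}
    for path in sorted(Path(tables_dir).glob("*.csv")):
        # Assumption: files are UTF-8; "utf-8-sig" also tolerates a byte-order mark.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            reader = csv.reader(handle)
            # The first row is taken as the header; an empty file yields [].
            columns[path.name] = next(reader, [])
    return columns
```

Calling list_table_columns("Dataset/tables") from the repository root would return one entry per table, which can then be checked against the variable names in the codebook.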
Steps to reproduce
All data-processing steps that produced this dataset are documented in the provided README.txt file. To replicate the dataset, open the Dataset/code/main.ipynb notebook in the repository. The notebook reads the raw data (not shared here), performs the analysis and aggregation, and exports the output as .csv and .png files. Full reproduction requires access to the original raw data from the Kazakhstan National Education Database (NEDB).
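Because the raw records are not shared, the aggregation step can only be illustrated here, not reproduced. The sketch below shows one common way to derive exit counts and a year-over-year retention rate from teacher-level records. The definitions used are assumptions for illustration and may differ from the formulas documented in README.txt. The function name year_over_year and the teacher IDs in the usage example are hypothetical.

```python
def year_over_year(ids_by_year):
    """Return {year: (exits, retention_rate)} for each pair of consecutive years.

    ids_by_year maps a year to the set of teacher IDs employed in that year.
    Assumed definitions (not taken from README.txt):
      exits          = teachers present in year t but absent in year t + 1
      retention_rate = teachers present in both years / teachers present in year t
    """
    results = {}
    for year in sorted(ids_by_year):
        current = ids_by_year[year]
        following = ids_by_year.get(year + 1)
        # Skip the last year on record and years with no teachers.
        if following is None or not current:
            continue
        stayed = current & following
        results[year] = (len(current) - len(stayed), len(stayed) / len(current))
    return results


# Made-up IDs: teacher "d" leaves after 2015 and teacher "e" joins in 2016.
example = {2015: {"a", "b", "c", "d"}, 2016: {"a", "b", "c", "e"}}
print(year_over_year(example))  # {2015: (1, 0.75)}
```

New entrants such as "e" do not affect either indicator under these definitions, since both are measured relative to the teachers present in the earlier year.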
Funders
- Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan. Grant ID: Grant No. BR24993019