QazUNTv2: Dataset of high school math problems in English and Russian

Published: 14 August 2024 | Version 1 | DOI: 10.17632/52vc6v4czj.1

Description

This dataset is intended for verifying the correctness of responses generated by an LLM (GPT-3.5 Turbo) to mathematics problems similar to those found in exams for graduate schools. It includes the problems and their types in both Russian and English, along with five answer options, manually solved answers, and detailed solutions in both languages. The primary goal is to analyze and compare the answers generated by GPT-3.5 Turbo with the provided correct solutions.

The data has been collected and structured across three sections of mathematics (Algebra, Logic and Probability) to support a comprehensive evaluation of GPT's ability to understand and solve these problems. The number of problems per section is:

1. Algebra: 436 problems
2. Logic: 312 problems
3. Probability: 163 problems

This dataset will facilitate a detailed assessment of GPT's performance in mathematical problem-solving across these domains.

For future analysis, we also calculated the number of tokens, which may help when generating responses from GPT-3.5 Turbo. The English math problems comprise 37563 tokens and the Russian math problems comprise 66406 tokens. The average number of tokens per problem is approximately 39.96 in English and approximately 70.64 in Russian.
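The comparison of model answers with the correct options described above could be scored per section roughly as follows. This is a minimal sketch, not the authors' evaluation code: the keys `section`, `correct_option` and `model_option` are hypothetical names, and the sample rows are invented purely for illustration.

```python
from collections import defaultdict


def accuracy_by_section(rows):
    """Return the share of correctly answered problems per section.

    Each row is a dict with the hypothetical keys 'section',
    'correct_option' and 'model_option' (assumed names, not the
    dataset's actual column names).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        section = row["section"]
        totals[section] += 1
        # Compare option labels ignoring case and surrounding whitespace.
        expected = row["correct_option"].strip().upper()
        answered = row["model_option"].strip().upper()
        if answered == expected:
            hits[section] += 1
    return {section: hits[section] / totals[section] for section in totals}


# Invented sample rows, only to show the expected input shape.
sample_rows = [
    {"section": "Algebra", "correct_option": "A", "model_option": "a"},
    {"section": "Algebra", "correct_option": "C", "model_option": "B"},
    {"section": "Logic", "correct_option": "E", "model_option": "E "},
    {"section": "Probability", "correct_option": "D", "model_option": "A"},
]
scores = accuracy_by_section(sample_rows)
print(scores)
```

Reporting accuracy separately for Algebra, Logic and Probability keeps the larger sections from hiding weak performance in the smaller ones.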

Steps to reproduce

1. Parsing PDF to text and keeping only text-based math problems.
2. Manually categorizing the problems into the following math areas: Algebra, Logic and Probability.
3. Data cleaning: filtering out badly parsed problems and adding new problems to balance the dataset.
4. Creating new columns with the problem description and options in English using Google Translate.
5. Manually solving the problems in Russian and English and adding a category column.
6. Statistical analysis with the Python libraries Pandas and Tiktoken to understand the scale of the dataset.
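The token statistics in the last step might be computed along these lines. This is a hedged sketch under stated assumptions rather than the authors' script: the tokenizer is passed in as a function, so the runnable demo uses a plain whitespace splitter as a stand-in, and the commented lines show how Tiktoken's encoder could be plugged in instead. The column name `problem_en` and the DataFrame `df` are hypothetical.

```python
from statistics import mean


def token_stats(problems, encode):
    """Return total and mean token counts for a list of problem texts.

    `encode` is any function mapping a string to a sequence of tokens.
    """
    counts = [len(encode(text)) for text in problems]
    return {"total": sum(counts), "mean": mean(counts) if counts else 0.0}


def whitespace_encode(text):
    # Stand-in tokenizer for the demo; real token counts will differ.
    return text.split()


demo_problems = [
    "Solve 2x + 3 = 7.",
    "Find the probability of two heads in three coin tosses.",
]
stats = token_stats(demo_problems, whitespace_encode)
print(stats)

# With Tiktoken installed, the GPT-3.5 Turbo encoder could be used instead
# (assumes a pandas DataFrame `df` with a hypothetical column "problem_en"):
#   import tiktoken
#   enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
#   stats = token_stats(df["problem_en"].tolist(), enc.encode)
```

Passing the encoder as an argument makes it possible to run the same function on the English and Russian columns and to swap in another tokenizer if a different model is evaluated later.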

Institutions

Kazakh British Technical University

Categories

Mathematics, Statistics, Algebra, Natural Language Processing, Problem Solving, Natural Language Generation, Arithmetic, Logical Thinking

Funding

The Committee of Science of the Ministry of Education and Science of the Republic of Kazakhstan

AP23489782

Licence