Uzbek Medical Entity Benchmark (UZ-EDBench)

Published: 20 March 2026 | Version 1 | DOI: 10.17632/hnfmrknzz9.1
Contributors:
Botir Elov

Description

UZ-EDBench is a structured Uzbek-language dataset designed for medical entity recognition and classification in a low-resource setting. The dataset is distributed in TSV format and consists of annotated tokens and domain-specific entity labels. It includes a main annotated corpus (UZ-EDBench.tsv) and a structured list of medical specialists (UZ-EDBench.Doctors.tsv). The dataset addresses the lack of Uzbek medical NLP benchmarks, annotated corpora for clinical entity extraction, and structured taxonomies of medical specialists.
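Because the files are plain TSV, they can be inspected with the Python standard library alone. The following is a minimal sketch, not an official loader: the function names read_tsv_rows and count_labels are hypothetical, and the column layout is an assumption, since the description does not specify it. Here the entity label is assumed to sit in the last column; adjust label_column after checking the actual file. If the file has a header row, it is returned as the first row.

```python
import csv
from collections import Counter


def read_tsv_rows(path):
    """Return the non-empty rows of a TSV file as lists of strings."""
    with open(path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quotation marks in the text as literal characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader if any(cell.strip() for cell in row)]


def count_labels(rows, label_column=-1):
    """Count the values in one column.

    ASSUMPTION: the entity label is in the last column. The dataset
    description does not state the column order, so verify this first.
    """
    return Counter(row[label_column] for row in rows)
```

A typical use would be rows = read_tsv_rows("UZ-EDBench.tsv") followed by count_labels(rows) to get a quick view of the label distribution before training or evaluation.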

Files

Categories

Natural Language Processing, Health

Licence