Uzbek Medical Entity Benchmark (UZ-EDBench)
Published: 20 March 2026 | Version 1 | DOI: 10.17632/hnfmrknzz9.1
Contributors:
Botir Elov
Description
UZ-EDBench is a structured Uzbek-language dataset designed for medical entity recognition and classification in a low-resource setting. The dataset is distributed in TSV format and consists of annotated tokens and domain-specific entity labels. It includes a main annotated corpus (UZ-EDBench.tsv) and a structured list of medical specialists (UZ-EDBench.Doctors.tsv).
This dataset addresses the lack of Uzbek medical NLP benchmarks, annotated corpora for clinical entity extraction, and structured taxonomies of medical specialists.
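Because the description states only that the files are TSV with annotated tokens and entity labels, the following is a minimal sketch of how such a file might be read, not the dataset's documented schema. The column positions (token first, label second), the helper names read_token_label_rows and label_counts, and the sample tokens and labels are all assumptions made for illustration; inspect the actual file before relying on them.

```python
import csv
import io
from collections import Counter


def read_token_label_rows(handle, token_col=0, label_col=1):
    """Yield (token, label) pairs from a tab-separated file handle.

    The column positions are assumptions, not the documented layout.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and rows too short to hold both columns.
        if len(row) <= max(token_col, label_col):
            continue
        yield row[token_col], row[label_col]


def label_counts(handle, **kwargs):
    """Count how often each label occurs in the file."""
    return Counter(label for _, label in read_token_label_rows(handle, **kwargs))


# Made-up rows for illustration only; not taken from UZ-EDBench.
sample = "bemor\tO\nkardiolog\tSPECIALIST\n\nshifokor\tSPECIALIST\n"
counts = label_counts(io.StringIO(sample))
print(counts)
```

To try this on the real corpus, one could pass an open file instead of the in-memory sample, for example open("UZ-EDBench.tsv", encoding="utf-8", newline=""), assuming the file is UTF-8 encoded. If the file has a header row, it will be counted as data unless skipped first.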
Categories
Natural Language Processing, Health