Abstracts_Ophthamology_NLP_dataset

Published: 1 February 2024 | Version 1 | DOI: 10.17632/wgjsz4n4rb.1
Contributor:
Hina Raja

Description

Our retinal diseases (RenD) dataset comprises 1000 articles sourced from PubMed, covering conditions such as diabetic retinopathy (DR), glaucoma, diabetic macular edema (DME), age-related macular degeneration (AMD), cataract, dry eye, retinal detachment, and central serous retinopathy (CSR). To ensure accurate categorization, we enlisted six domain specialists, who annotated the articles based on their abstracts. Each article in the dataset was reviewed and annotated by at least three individual annotators (see supplemental Table A1 for the annotation guidelines). This multiple-annotator approach helps mitigate the biases and inconsistencies that could arise from a single annotator's perspective. Once annotation was complete, the final label for each article was determined by majority voting.
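
The majority-voting step described above can be illustrated with a minimal Python sketch. It is not the contributors' actual code: the function name `majority_label`, the article identifiers, and the example labels are all hypothetical, and the description does not say how ties were resolved, so returning `None` when no label wins a strict majority is an assumption made here.

```python
from collections import Counter


def majority_label(annotations):
    """Return the label chosen by a strict majority of annotators.

    Returns None when the list is empty or no label exceeds half the votes
    (an assumed tie policy; the dataset description does not specify one).
    """
    if not annotations:
        return None
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes > len(annotations) / 2 else None


# Hypothetical annotations: three annotators per article.
example_annotations = {
    "article_1": ["DR", "DR", "DME"],
    "article_2": ["glaucoma", "glaucoma", "glaucoma"],
    "article_3": ["AMD", "CSR", "DR"],  # no majority
}

final_labels = {
    article_id: majority_label(labels)
    for article_id, labels in example_annotations.items()
}
```

In this sketch, an article with no majority label would need a further step, such as adjudication by an additional annotator; whether the dataset used such a step is not stated.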

Files

Institutions

University of Tennessee Health Science Center

Categories

Ophthalmology, Natural Language Processing, Information Classification

Licence