Doctor's Answer Text Dataset in Indonesian Contains Information on Medical Interview Patterns

Published: 14 May 2023 | Version 1 | DOI: 10.17632/p8d5bynh3m.1
Safitri Juanita


The dataset was collected by extracting data from an Indonesian website that offers free Online Health Consultation (OHC). It contains Indonesian health consultation texts posted from December 8, 2014, to February 28, 2021, totalling 497,974 raw records. The labelled data consists of 500 doctor's answer texts, randomly selected from the raw data and manually annotated by four medical experts using the six medical interview functions described in the research article: 1. Relationship Building (FR), 2. Gathering Information (GI), 3. Providing Information (PI), 4. Decision Making (DM), 5. Enabling disease- and treatment-related behaviours (EDTRB), 6. Responding to Emotions (RE). The clean dataset contains the pre-processed version of the labelled data.
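The six-label scheme described above can be written down as a small lookup table. The following is a minimal sketch only: the description does not specify file names, column layout, or whether a text can carry more than one label, so the `(text, code)` record shape, the one-label-per-text assumption, the `count_labels` helper, and the sample sentences are all hypothetical.

```python
from collections import Counter

# Codes and names exactly as listed in the dataset description.
MEDICAL_INTERVIEW_FUNCTIONS = {
    "FR": "Relationship Building",
    "GI": "Gathering Information",
    "PI": "Providing Information",
    "DM": "Decision Making",
    "EDTRB": "Enabling disease- and treatment-related behaviours",
    "RE": "Responding to Emotions",
}


def count_labels(records):
    """Count label codes, rejecting any code outside the six functions.

    `records` is assumed to be an iterable of (text, code) pairs with
    one label per text; the real files may be organised differently.
    """
    counts = Counter()
    for _text, code in records:
        if code not in MEDICAL_INTERVIEW_FUNCTIONS:
            raise ValueError(f"unknown label code: {code!r}")
        counts[code] += 1
    return counts


# Hypothetical placeholder records, not taken from the dataset.
sample = [
    ("How long have you had the fever?", "GI"),
    ("Please see a doctor if the symptoms persist.", "DM"),
    ("When did the pain start?", "GI"),
]

for code, n in sorted(count_labels(sample).items()):
    print(f"{code} ({MEDICAL_INTERVIEW_FUNCTIONS[code]}): {n}")
```

Validating codes against the lookup table before counting makes typos in the annotation column surface as errors instead of silently creating a seventh class.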



Institut Teknologi Sepuluh Nopember


Computer Science, Natural Language Processing, Consultation in Healthcare, Text Mining, Virtual Consultation