Cognitive Distortion Dataset for Text Classification in Bahasa Indonesia

Name: Cognitive Distortion Dataset for Text Classification in Bahasa Indonesia
Creator: Hendra Suputra
Published: 2025-02-06T17:35:54.494Z
Keywords: Psychology, Mental Health, Classification System, Text Mining, Binary Classification

Suputra, Hendra; Linawati, Linawati; Sastra, Nyoman Putra; Sukadarmika, Gede; Ariwilani, Ni Made; Desira Swandi, Ni Luh Indah

doi:10.17632/k84bkv8dkt.1

Published: 6 February 2025 | Version 1 | DOI: 10.17632/k84bkv8dkt.1

Contributors:

Hendra Suputra

Description

This dataset contains text sentences that exhibit cognitive distortions, which are closely related to thought disorder. It is the first dataset of cognitive distortion sentences in Indonesian. The sentences, both distorted and non-distorted, were collected from answers to an online questionnaire. The questions were compiled by an expert, in this case a psychologist, and experts also annotated the sentences to assign distortion classes. The set of cognitive distortion classes follows the theory of Burns, D.D. (1999) in the book "The Feeling Good Handbook". The dataset contains 4665 sentences in total. It includes complete sentences and the distorted parts of sentences, which are flanked by the "$" sign, along with the labels from two annotators in separate columns.
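The "$" markers and the two annotator columns can be handled with a few lines of Python. The following is a minimal sketch, not the authors' own tooling. It assumes that each distorted part is wrapped in a pair of "$" signs and that the two annotators' labels have already been read into two lists of equal length. The function names are hypothetical, and the sample sentence is invented for illustration rather than taken from the dataset.

```python
import re
from typing import List, Sequence

# Assumption: a distorted part is wrapped in a pair of "$" signs,
# e.g. "... $distorted part$ ...". The real files may differ.
SPAN_PATTERN = re.compile(r"\$(.+?)\$")


def extract_distortion_spans(sentence: str) -> List[str]:
    """Return the text of every "$"-delimited part, stripped of outer whitespace."""
    return [span.strip() for span in SPAN_PATTERN.findall(sentence)]


def strip_markers(sentence: str) -> str:
    """Return the sentence with all "$" markers removed."""
    return sentence.replace("$", "")


def agreement_rate(labels_a: Sequence, labels_b: Sequence) -> float:
    """Fraction of sentences on which the two annotators gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same number of sentences")
    if len(labels_a) == 0:
        raise ValueError("no labels given")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


if __name__ == "__main__":
    # Invented example sentence, not taken from the dataset.
    example = "Saya $selalu gagal dalam segala hal$ di kantor."
    print(extract_distortion_spans(example))
    print(strip_markers(example))
    # Invented labels for four sentences from two annotators.
    print(agreement_rate([1, 0, 1, 1], [1, 0, 0, 1]))
```

The raw agreement rate is only a first check. A chance-corrected measure such as Cohen's kappa would be a more informative way to compare the two annotators.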

Steps to reproduce

The dataset in this study was collected with a questionnaire. The questionnaire covers personal data, visits to psychologists, and questions about life, and is intended for Indonesians aged 17 and over. The questions were compiled based on discussions with an expert in the field of psychology, namely a psychologist. The questionnaire was distributed online through the Google Form platform and received 593 respondents. The experts then analyzed and annotated each answer given by the respondents, which produced a dataset of 4665 sentences.

Institutions

Universitas Udayana
