Question-Answer Pairs Goddess Durga

Published: 29 October 2024 | Version 1 | DOI: 10.17632/6hcx53kywr.1
Contributors:
Tri Lathif Mardi Suryanto

Description

Durga is one of the most important goddesses in the Hindu tradition, symbolising strength, protection and victory over evil. She is often depicted as a bold figure riding a lion or tiger and carrying many weapons that represent power and justice. Durga is worshipped in various festivals, most notably Durga Puja, which celebrates her victory over the buffalo demon Mahishasura. The symbolism and narratives surrounding Durga include themes of courage, sacrifice and protection, making her a highly revered figure in Hindu culture and religion.

Durga-related datasets such as Goddess Durga Question-Answer Pairs have potential uses in Natural Language Processing (NLP). They can be used to build question-answering models that support education and the dissemination of information about Hindu mythology and teachings. Future research could develop an AI-based chatbot that answers Durga-related questions automatically, making religious information more accessible. Sentiment analysis could be conducted to understand how people perceive Durga and the values she represents. The dataset can also support cross-cultural research comparing views on deities across religious traditions. Through NLP techniques, research on Durga can expand our understanding of this figure's influence and socio-cultural significance in the modern context.

Steps to reproduce

To build the Goddess Durga Question-Answer Pairs dataset, we used a mixed-method approach. First, we collected question-answer pairs from print sources, such as books and articles on Goddess Durga, and from reliable online platforms, such as Q&A forums and educational websites. Data extraction involved manual transcription and web mining with Python. The collected data, comprising 4,179 question-answer pairs and 20,894 words, were compiled in CSV format, then cleaned and preprocessed using the spaCy NLP library for tokenisation and lemmatisation. Lastly, the dataset was validated by expert review.
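
The compile-and-clean step could be sketched roughly as follows. This is a minimal illustration and not the contributors' actual pipeline: the column names "question" and "answer" and the sample rows are hypothetical, and plain whitespace splitting stands in for the spaCy tokenisation and lemmatisation step, which is not reproduced here.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file; the column names
# "question" and "answer" are assumptions, not taken from the dataset.
SAMPLE_CSV = """question,answer
Who is Durga?,  A Hindu goddess of strength and protection.
Who is Durga?,A Hindu goddess of strength and protection.
Which festival honours Durga?,Durga Puja
What does Durga ride?,
"""


def clean_pairs(rows):
    """Normalise whitespace, drop incomplete rows and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        question = " ".join((row.get("question") or "").split())
        answer = " ".join((row.get("answer") or "").split())
        if not question or not answer:
            continue  # skip pairs with a missing question or answer
        key = (question.lower(), answer.lower())
        if key in seen:
            continue  # skip case-insensitive duplicates
        seen.add(key)
        cleaned.append({"question": question, "answer": answer})
    return cleaned


def count_words(pairs):
    """Count whitespace-separated words across all questions and answers."""
    return sum(
        len(p["question"].split()) + len(p["answer"].split()) for p in pairs
    )


pairs = clean_pairs(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(len(pairs), "pairs,", count_words(pairs), "words")
```

To run the same check on the published file, the in-memory sample would be replaced with an opened CSV file, with the column names adjusted to match its actual header.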

Institutions

Universitas Negeri Malang, Koninklijk Instituut voor Taal- Land- en Volkenkunde

Categories

History, Education, Natural Language Processing, Culture, Questioned Document

Licence