Event Extraction Andersen's Fairy Tales

Published: 25 April 2024 | Version 1 | DOI: 10.17632/22v3kcgks3.1
Contributor:
Erna Daniati

Description

This dataset is the result of an extraction process applied to the fairy tales of Hans Christian Andersen. The source text was taken from the Project Gutenberg website. Each fairy tale is broken down into sentences together with their entity domains, and the number of words and the number of sentences are also recorded. The dataset is in JSON format, with the attributes title, number of sentences, number of words, and events.
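As a rough illustration of the record layout described here, the sketch below parses one JSON object per fairy tale into a small Python structure. The key names (`title`, `num_sentences`, `num_words`, `events`), the `TaleRecord` class, and the sample input are all assumptions made for illustration; the actual keys in the published file may differ and should be checked against the data.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class TaleRecord:
    # Hypothetical field names; the published JSON keys may differ.
    title: str
    num_sentences: int
    num_words: int
    events: list = field(default_factory=list)


def parse_records(raw: str) -> list[TaleRecord]:
    """Parse a JSON array of tale objects into TaleRecord instances."""
    return [
        TaleRecord(
            title=item["title"],
            num_sentences=item["num_sentences"],
            num_words=item["num_words"],
            events=item.get("events", []),
        )
        for item in json.loads(raw)
    ]


# Illustrative input only, not taken from the dataset.
SAMPLE = '[{"title": "Example Tale", "num_sentences": 2, "num_words": 9, "events": []}]'
```

Calling `parse_records(SAMPLE)` returns a one-element list; reading the real file would only require passing its contents in place of `SAMPLE`, after adjusting the key names.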

Files

Steps to reproduce

1. Retrieve data from the Gutenberg repository with the following link: https://www.gutenberg.org/cache/epub/1597/pg1597.txt
2. Calculate the number of words and sentences.
4. Identify the events in the fairy tale.
5. Identify the entity domain in the event.
6. Determine the entities involved in the fairy tale events.
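The retrieval and counting steps could be sketched as follows. This is a minimal example under stated assumptions, not the procedure actually used to build the dataset: the regular-expression definitions of a "word" and a "sentence" are guesses, and the function names are hypothetical. The event and entity steps are not covered, because the record does not say how they were carried out.

```python
import re
from urllib.request import urlopen

# URL given in the reproduction steps.
SOURCE_URL = "https://www.gutenberg.org/cache/epub/1597/pg1597.txt"


def fetch_text(url: str = SOURCE_URL) -> str:
    """Download the plain-text file. Requires network access."""
    with urlopen(url) as response:
        # utf-8-sig drops a leading byte-order mark if one is present.
        return response.read().decode("utf-8-sig")


def count_words(text: str) -> int:
    # Assumption: a word is a run of letters, digits, or apostrophes.
    return len(re.findall(r"[A-Za-z0-9']+", text))


def count_sentences(text: str) -> int:
    # Assumption: a sentence ends at one or more of . ! ?
    # Abbreviations such as "Mr." will be over-counted by this rule.
    return len(re.findall(r"[.!?]+", text))
```

A dedicated sentence tokenizer would likely give counts closer to those in the dataset; the regular expressions here are only meant to make the steps concrete.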

Institutions

Universitas Negeri Malang

Categories

Natural Language Processing, Text Extraction

Licence