Corpus of Sarcasm in Twitter Conversations

Published: 21 February 2018 | Version 1 | DOI: 10.17632/fn2mmff85g.1
Contributor: Gavin Abercrombie

Description

A corpus of two-part author-audience Twitter conversations, each manually annotated with a sarcasm polarity label. The corpus is provided as a CSV file with the columns author, audience, label, where 'author' is the ID number of the target tweet, 'audience' is the ID number of the other tweet in the conversation, and 'label' is the hand-annotated sarcasm class: positive (1) or negative (0).
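
As an illustration of the author, audience, label layout described above, the following minimal Python sketch reads such a file into typed records. The names Conversation and read_corpus are hypothetical, the sample IDs are placeholders rather than real corpus entries, and whether the published file has a header row is an assumption the code works around by skipping any row whose label is not 0 or 1.

```python
import csv
import io
from typing import NamedTuple


class Conversation(NamedTuple):
    """One corpus row (hypothetical record type, not part of the dataset)."""
    author_id: str    # ID of the target tweet
    audience_id: str  # ID of the other tweet in the conversation
    sarcastic: bool   # True for label 1, False for label 0


def read_corpus(handle):
    """Parse rows of the form author, audience, label from an open text handle."""
    rows = []
    for record in csv.reader(handle):
        if len(record) != 3:
            continue  # ignore blank or malformed lines
        author, audience, label = (field.strip() for field in record)
        if label not in ("0", "1"):
            continue  # also skips a header row, if the file has one
        # IDs are kept as strings so they survive tools with limited integer precision.
        rows.append(Conversation(author, audience, label == "1"))
    return rows


# Placeholder data standing in for the real CSV file.
sample = io.StringIO("author,audience,label\n111,222,1\n333,444,0\n")
corpus = read_corpus(sample)
```

To read the downloaded file instead, pass an open file handle, for example `open(path, newline="")`, where `path` is wherever the CSV was saved.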

Steps to reproduce

In accordance with Twitter's terms of service, only the ID number of each tweet in the corpus is provided here. The tweet text and associated metadata can be retrieved from the Twitter API using these ID numbers.
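
Recovering the text usually means looking tweets up in batches rather than one at a time. The sketch below only prepares comma-separated ID batches and makes no network calls. The function name batched_ids is hypothetical, the IDs are placeholders, and the default batch size of 100 is an assumption based on the per-request limit of Twitter's tweet lookup endpoints, so it should be checked against the current API documentation.

```python
from itertools import islice


def batched_ids(tweet_ids, batch_size=100):
    """Yield comma-separated strings of at most batch_size tweet IDs each."""
    iterator = iter(tweet_ids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield ",".join(batch)


# Placeholder IDs; in practice these would be the author and audience
# columns of the corpus file.
ids = [str(n) for n in range(1, 251)]
batches = list(batched_ids(ids))
```

Each yielded string can then be passed as the ID list of a lookup request made with whichever API client is in use. Tweets that have been deleted or made private since annotation will not be returned, so the recovered corpus may be smaller than the ID list.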

Categories

Social Sciences, Computer Science, Computational Linguistics, Data Science, Natural Language Processing, Statistical Natural Language Processing

Licence