Corpus of Sarcasm in Twitter Conversations

Published: 21-02-2018 | Version 1 | DOI: 10.17632/fn2mmff85g.1
Gavin Abercrombie


A corpus of two-part author-audience Twitter conversations with manually annotated sarcasm polarity labels. The corpus is provided as a CSV file with three columns (author, audience, label), where 'author' is the ID number of the target tweet, 'audience' is the ID number of the other tweet in the conversation, and 'label' is the hand-annotated sarcasm class label: positive (1) or negative (0).
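As an illustration of the file layout described above, the following minimal sketch reads the three columns with Python's standard csv module. The names Conversation and read_corpus are hypothetical, the sample rows are made-up placeholders rather than real corpus entries, and whether the actual file begins with a header row is an assumption; the sketch simply skips any row whose label is not 0 or 1.

```python
import csv
import io
from typing import NamedTuple


class Conversation(NamedTuple):
    """One corpus row (hypothetical container, not part of the dataset)."""
    author_id: str    # ID number of the target tweet
    audience_id: str  # ID number of the other tweet in the conversation
    sarcastic: bool   # True for label 1, False for label 0


def read_corpus(lines):
    """Parse rows in the format author, audience, label."""
    rows = []
    for record in csv.reader(lines):
        if len(record) != 3:
            continue  # ignore blank or malformed lines
        author, audience, label = (field.strip() for field in record)
        if label not in ("0", "1"):
            continue  # also skips a header row, if the file has one
        rows.append(Conversation(author, audience, label == "1"))
    return rows


# Placeholder data standing in for the real file; these IDs are invented.
sample = io.StringIO("author,audience,label\n111,222,1\n333,444,0\n")
corpus = read_corpus(sample)
```

The IDs are kept as strings here because they are identifiers rather than quantities, which also avoids precision loss if the data is later passed to tools that store large integers as floating-point numbers.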


Steps to reproduce

In accordance with Twitter's terms of service, only the ID number of each tweet in the corpus is provided here. The text and associated metadata can be retrieved from the Twitter API using these ID numbers.
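A minimal sketch of one preparatory step for that retrieval: collecting the unique tweet IDs and grouping them into batches for bulk lookup. The helper names unique_ids and batched_ids are hypothetical, and the default batch size of 100 is an assumption that should be checked against the current API documentation. The sketch makes no network calls; the request itself depends on the API version and credentials in use.

```python
from itertools import islice


def unique_ids(pairs):
    """Return every tweet ID from (author, audience) pairs, in first-seen order."""
    flat = (tweet_id for pair in pairs for tweet_id in pair)
    return list(dict.fromkeys(flat))


def batched_ids(ids, batch_size=100):
    """Yield successive lists of at most batch_size IDs (assumed per-request limit)."""
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Placeholder (author, audience) ID pairs; these are not real corpus entries.
pairs = [("111", "222"), ("333", "222"), ("444", "555")]
ids = unique_ids(pairs)
batches = list(batched_ids(ids, batch_size=2))
```

Each batch can then be passed to whichever lookup call the chosen API client provides.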