A Twitter Dataset for Monkeypox, May-Dec, 2022

Published: 29 December 2022 | Version 1 | DOI: 10.17632/242whtdt3m.1
Zahra Nia


The number of Monkeypox infections is increasing around the globe. The world has not yet recovered from the damage caused by the COVID-19 pandemic and could not bear another catastrophe, so it is very important to contain the Monkeypox outbreak. This dataset includes 2,400,202 tweet ids and user ids gathered through the Twitter API with an Academic Research account, using the keywords monkeypox or “monkey pox” or “viruela dei mono” or “variole du singe” or “variola do macoco”. The dataset includes all geotagged and non-geotagged tweets matching these keywords that were posted from anywhere in the world during the Monkeypox outbreak of 2022, from May 1 to December 25, 2022. Each row in the dataset corresponds to a different tweet. The dataset has two columns, TweetID and AuthorID. TweetID provides the id of the tweet, and AuthorID provides the id of the user who posted the tweet. The first row of the dataset is the header. Twitter is becoming increasingly popular for sharing ideas, opinions, concerns, and experiences, and researchers in different areas of study have previously used it successfully. This dataset is made available to researchers to study various aspects of Monkeypox, such as trend prediction, stigmatization of minority and marginalized populations, misinformation and fake news detection, and hotspot identification, to help control the outbreak.
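The two-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset itself: it assumes the file is comma-separated and that the header row uses exactly the names TweetID and AuthorID. The function names load_ids and summarize are hypothetical. Ids are kept as strings so that long numeric ids are never rounded by a numeric conversion.

```python
import csv
from collections import Counter


def load_ids(path):
    """Return a list of (tweet_id, author_id) string pairs from the dataset file.

    Assumes a comma-separated file whose first row is the header
    "TweetID,AuthorID" (column names taken from the dataset description).
    """
    pairs = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # consumes the header row
        for row in reader:
            # Keep ids as text; they are identifiers, not quantities.
            pairs.append((row["TweetID"].strip(), row["AuthorID"].strip()))
    return pairs


def summarize(pairs):
    """Count rows, distinct tweets, distinct authors, and the most active authors."""
    authors = Counter(author for _, author in pairs)
    return {
        "tweets": len(pairs),
        "unique_tweets": len({tweet for tweet, _ in pairs}),
        "authors": len(authors),
        "top_authors": authors.most_common(3),
    }
```

Calling summarize(load_ids(path)) on the downloaded file, where path is wherever the file was saved locally, gives a quick check that the row count matches the number of tweets stated in the description.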


Steps to reproduce

In compliance with Twitter's developer agreement and policy [1], only tweet ids and user ids are shared with the public. To access other metadata, such as creation date, number of likes, replies, and retweets, language, geolocation, and user-specified location, the tweet ids need to be hydrated. One popular tool for hydrating tweets is the DocNow Hydrator [2]. After installing the Hydrator and authorizing it with your Twitter account, upload a file containing the tweet ids to it. By default, the Hydrator returns the tweets and their metadata in .json format, but it can also be set to return the results in other formats such as .csv.

[1] Twitter, Developer agreement and policy, Oct 2022, (Accessed: Dec 28, 2022), https://developer.twitter.com/en/developer-terms/agreement-and-policy.

[2] DocNow Hydrator, Aug 2021, (Accessed: Dec 28, 2022), GitHub - DocNow/hydrator: Turn Tweet IDs into Twitter JSON & CSV from your desktop!
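The file uploaded to the Hydrator is a plain text file with one tweet id per line, so the TweetID column has to be extracted from the dataset first. The sketch below shows one way to do that, under the same assumptions as the description above (comma-separated file, header names TweetID and AuthorID). The function name write_hydrator_input and both file paths are hypothetical. Skipping non-numeric values and repeated ids is a precaution added here, not a step stated by the dataset author.

```python
import csv


def write_hydrator_input(csv_path, out_path):
    """Write the TweetID column to out_path, one id per line.

    Returns the number of ids written. Non-numeric values and repeated
    ids are skipped (a defensive choice, not required by the dataset).
    """
    seen = set()
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):  # first row is the header
            tweet_id = (row.get("TweetID") or "").strip()
            if not tweet_id.isdigit() or tweet_id in seen:
                continue
            seen.add(tweet_id)
            dst.write(tweet_id + "\n")
            written += 1
    return written
```

The resulting text file can then be added as a new dataset in the Hydrator. Tweets that have since been deleted or made private will not be returned, so the hydrated set may be smaller than the id list.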


York University


Smallpox, Twitter, Pandemic


Canada’s International Development Research Centre (IDRC) and Swedish International Development Cooperation Agency (SIDA)