Dataset of Tweets and News Media on COVID-19 in Portuguese

Published: 25 July 2020 | Version 3 | DOI: 10.17632/vhxdgjfjnk.3
Tiago de Melo


Collection of 3,925,366 tweets and 18,413 online news articles covering the online discussion of COVID-19 in Brazil. The Twitter data were collected with the Twitterscraper Python library, using a set of Portuguese keywords related to COVID-19. To help researchers and data enthusiasts identify tweets that contain hashtags, media, or retweets, we created a separate dataset for each of these three categories. The news on COVID-19 was collected from the UOL portal, the most popular Brazilian website. All data were gathered from January to May 2020. These datasets may be of interest to communities such as data science, social science, natural language processing, tourism, infodemiology, and public health.
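
The split into hashtag, media, and retweet subsets described above can be illustrated with a minimal Python sketch. It is not the author's collection or processing code. The record layout (the `text`, `media_urls`, and `is_retweet` fields) and the helper names `categorize` and `split_by_category` are assumptions made for illustration, so the field names should be adapted to the actual files in the dataset.

```python
import re

# Hypothetical record layout: the field names "text", "media_urls" and
# "is_retweet" are assumptions for illustration, not the dataset's schema.
HASHTAG_RE = re.compile(r"#\w+")


def categorize(tweet):
    """Return the set of categories a single tweet record belongs to."""
    categories = set()
    if HASHTAG_RE.search(tweet.get("text", "")):
        categories.add("hashtags")
    if tweet.get("media_urls"):
        categories.add("media")
    if tweet.get("is_retweet", False):
        categories.add("retweets")
    return categories


def split_by_category(tweets):
    """Group tweets into three subsets; a tweet may appear in more than one."""
    subsets = {"hashtags": [], "media": [], "retweets": []}
    for tweet in tweets:
        for category in categorize(tweet):
            subsets[category].append(tweet)
    return subsets


# Small made-up sample, only to exercise the functions.
sample = [
    {"text": "Fique em casa #covid19", "media_urls": [], "is_retweet": False},
    {"text": "Novos casos confirmados",
     "media_urls": ["https://example.com/a.jpg"], "is_retweet": True},
    {"text": "Sem marcadores", "media_urls": [], "is_retweet": False},
]

subsets = split_by_category(sample)
print({name: len(items) for name, items in subsets.items()})
```

Because the categories are not mutually exclusive, the sketch lets one tweet fall into several subsets, which matches the idea of one dataset per category rather than a single partition.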



Universidade do Estado do Amazonas


Social Sciences, Computer Science, Health Informatics