COVID-19 RUMOR: a classified dataset of COVID-19 related online rumors in Brazilian Portuguese

Published: 01-08-2020 | Version 2 | DOI: 10.17632/pz2j957rzc.2
Patricia Takako Endo,
Gleyson Rhuan Nascimento Campos,
Maria Eduarda de Lima Xavier,
Kayo Henrique Carvalho Monteiro,
Maicon Herverton Lino Ferreira da Silva Barros,
Ivanovitch Silva,
Breno Santos


The dataset is composed of rumors and non-rumors related to COVID-19, collected from three online sources: (a) the official website of the Brazilian Ministry of Health, (b) a journalistic initiative focused on debunking online rumors, including those about COVID-19, and (c) the O Globo news outlet, which provides a dedicated section for following rumors about COVID-19. The Brazilian Ministry of Health dataset is composed of 79 rumor and 5 non-rumor texts classified by the Brazilian government itself; the dataset from the journalistic initiative is composed of 951 rumors classified and debunked by a team of journalists; and the O Globo dataset is composed of 261 rumors and 3 non-rumors, also classified by journalists. In total, the COVID-19 RUMOR dataset has 1291 rumors and 8 non-rumors.
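As a quick sanity check on the totals above, the per-source counts can be tallied in a few lines of Python. This is only a minimal sketch: the dictionary keys are illustrative labels chosen here, not file or column names from the dataset, and the zero non-rumor count for the journalistic initiative is inferred from the stated total of 8 non-rumors rather than stated directly.

```python
# Per-source counts as given in the dataset description.
# Keys are hypothetical labels, not identifiers from the dataset files.
counts = {
    "ministry_of_health": {"rumor": 79, "non_rumor": 5},
    # Assumption: no non-rumors from this source (inferred from the total of 8).
    "journalistic_initiative": {"rumor": 951, "non_rumor": 0},
    "o_globo": {"rumor": 261, "non_rumor": 3},
}

total_rumors = sum(c["rumor"] for c in counts.values())
total_non_rumors = sum(c["non_rumor"] for c in counts.values())
total = total_rumors + total_non_rumors

# Share of non-rumor texts in the whole collection.
non_rumor_share = total_non_rumors / total

print(total_rumors, total_non_rumors)  # 1291 8
print(f"{non_rumor_share:.2%}")        # 0.62%
```

The counts add up to the stated totals, and the non-rumor class accounts for well under 1% of the 1299 texts.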