A corpus designed to study preprints produced during the Covid-19 crisis and to make comparative studies with the pre-pandemic period

Published: 11-03-2021 | Version 1 | DOI: 10.17632/rn9b93x5d4.1
Frederique Bordignon,
Liana Ermakova,


This dataset was created to support comparative studies of the abstracts of preprints issued in response to the COVID-19 pandemic (from 01/01/2020 to 12/04/2020) and abstracts produced in 2019, the closest pre-pandemic period. The dataset has two files:
- a txt file with the queries we ran in Dimensions and Lens to build the whole corpus and retrieve metadata
- a csv file with the metadata for all preprints in the corpus and the positive, negative and hedge words we extracted with the CorTexT Manager tool.
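
As an illustration of the kind of comparison the csv file is meant to support, the sketch below averages the number of positive, negative and hedge words per abstract for 2019 and for 2020. The column names (`date`, `positive_words`, `negative_words`, `hedge_words`), the ISO date format and the `;` separator between words are assumptions made for this example only; they are not taken from the dataset and should be replaced with the actual headers and delimiter of the csv file. The sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def mean_counts_by_period(csv_file, date_col="date",
                          count_cols=("positive_words", "negative_words", "hedge_words"),
                          sep=";"):
    """Average number of extracted words per abstract, split by period.

    All column names and the word separator are hypothetical defaults.
    """
    totals = defaultdict(lambda: defaultdict(int))
    n_rows = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Assumes the date starts with a four-digit year (e.g. 2020-03-15).
        year = row[date_col][:4]
        period = "pandemic" if year == "2020" else "pre-pandemic"
        n_rows[period] += 1
        for col in count_cols:
            # Empty cells yield no words.
            words = [w for w in row[col].split(sep) if w.strip()]
            totals[period][col] += len(words)
    return {
        period: {col: totals[period][col] / n_rows[period] for col in count_cols}
        for period in n_rows
    }


# Invented sample rows standing in for the real csv file.
sample = io.StringIO(
    "date,positive_words,negative_words,hedge_words\n"
    "2019-05-02,novel;robust,,may\n"
    "2020-03-15,promising,severe;fatal,might;could\n"
    "2020-04-01,,,suggest\n"
)
result = mean_counts_by_period(sample)
for period, means in sorted(result.items()):
    print(period, means)
```

To run this on the real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, and adjust the column names and separator to match the file.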