2019 CA Ridgecrest earthquake word libraries and supporting data results

Published: 10 June 2021 | Version 4 | DOI: 10.17632/z9xjcmg6s2.4
Lingyao Li


This dataset contains files related to the 2019 CA Ridgecrest earthquake, including the word dictionaries used to clean the data and our time-series results for damage level. We used the Twitter Standard Search API with the key search term “earthquake” to retrieve related tweets posted from 07/04/2019 to 07/10/2019 (UTC). The original data were stored in JavaScript Object Notation (.json) files, which were converted to Excel (.xlsx) files for subsequent processing. The word pattern files contain the words and phrases used to filter tweets related to the 2019 CA Ridgecrest earthquake. The result data files contain the temporal and spatial results that support the findings of our article "Social media crowdsourcing for rapid damage assessment following a sudden-onset natural hazard event" (still under review). We attach these files for the sole purpose of validating our research results. The last data file lists the tweets used in our research (tweet IDs only); these are the filtered, damage-related tweets about the 2019 CA Ridgecrest earthquake. Please note that restrictions apply to the availability of these data, which were used under license from Twitter Inc. for the current study and so are not publicly available. The tweet IDs are available in this data file, but the original tweets/posts are available only with the permission of Twitter Inc.
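For readers who want a feel for how word patterns can be applied to tweets collected this way, the following is a minimal sketch and not the code used in the study. It assumes tweets are available as JSON objects with the `created_at`, `text`, and `id_str` fields of the Twitter v1.1 tweet format, treats the end date of the collection window as inclusive, and uses a short hypothetical pattern list (`EXAMPLE_PATTERNS`) in place of the published word pattern files. All function names are illustrative.

```python
import json
import re
from datetime import datetime, timezone

# Assumed collection window in UTC; 07/10/2019 is treated as inclusive,
# so the exclusive upper bound is midnight on 07/11/2019.
WINDOW_START = datetime(2019, 7, 4, tzinfo=timezone.utc)
WINDOW_END = datetime(2019, 7, 11, tzinfo=timezone.utc)

# Hypothetical phrases for illustration only; these are NOT the contents
# of the word pattern files published with the dataset.
EXAMPLE_PATTERNS = ["damage", "collapsed", "cracks in"]


def compile_patterns(phrases):
    """Build one case-insensitive regex that matches any phrase on word boundaries."""
    escaped = [re.escape(p) for p in phrases]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def parse_created_at(value):
    """Parse a v1.1-style timestamp such as 'Thu Jul 04 17:40:00 +0000 2019'."""
    return datetime.strptime(value, "%a %b %d %H:%M:%S %z %Y")


def filter_tweet_ids(tweets, pattern):
    """Return IDs of tweets inside the window whose text matches the pattern."""
    ids = []
    for tweet in tweets:
        created = parse_created_at(tweet["created_at"])
        if not (WINDOW_START <= created < WINDOW_END):
            continue  # outside the assumed collection window
        if pattern.search(tweet.get("text", "")):
            ids.append(tweet["id_str"])
    return ids


# Made-up sample records standing in for the original .json files.
sample_json = """
[
  {"id_str": "1", "created_at": "Thu Jul 04 17:40:00 +0000 2019",
   "text": "Earthquake! Cracks in my kitchen wall"},
  {"id_str": "2", "created_at": "Fri Jul 05 03:00:00 +0000 2019",
   "text": "Felt the earthquake, everyone is fine"},
  {"id_str": "3", "created_at": "Fri Jul 12 10:00:00 +0000 2019",
   "text": "Earthquake damage still visible"},
  {"id_str": "4", "created_at": "Sat Jul 06 03:25:00 +0000 2019",
   "text": "Store shelves collapsed after the earthquake"}
]
"""

damage_ids = filter_tweet_ids(json.loads(sample_json), compile_patterns(EXAMPLE_PATTERNS))
print(damage_ids)
```

In this sketch only the tweet IDs are kept, which mirrors how the last data file shares IDs rather than tweet content. Field names in the converted .xlsx files may differ from the ones assumed here.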



University of Maryland at College Park


Earthquake Hazard, Social Media Analytics