SITUATIONAL AWARENESS NAVIGATING THE POSTNORMAL—E-GOVERNMENT TEXT MINING

Published: 10 December 2021 | Version 1 | DOI: 10.17632/492ydjywwk.1
Contributor:
xiaolin lao

Description

1) COVID-19 comment data from the Wuhan City message board. 2) The data was crawled using Python. The collection URL is http://liuyan.cjn.cn/. The collection date is October 27, 2021. 3) First, a search was run using the Chinese keyword "疫情" ("epidemic"). Second, "View Details" was clicked on each message to open its secondary (detail) page. Third, the fields "title," "inquiry code," "user ID," "message time," "civic message content," "government response time," and "government response content" were scraped from each secondary page. Each message has a unique inquiry code, which can be used to filter out duplicates after crawling. In total, 13,598 unique messages (3,490,054 Chinese characters) were collected.

Steps to reproduce

The data was crawled using Python. The collection URL is http://liuyan.cjn.cn/. The collection date is October 27, 2021. First, a search was run using the Chinese keyword "疫情" ("epidemic"). Second, "View Details" was clicked on each message to open its secondary (detail) page. Third, the fields "title," "inquiry code," "user ID," "message time," "civic message content," "government response time," and "government response content" were scraped from each secondary page. Each message has a unique inquiry code, which can be used to filter out duplicates after crawling. In total, 13,598 unique messages (3,490,054 Chinese characters) were collected.
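
The post-crawl deduplication step can be sketched in Python as follows. This is a minimal illustration and not the contributor's actual script. The dictionary keys (such as inquiry_code and message_content), the sample records, and the choice to count characters only in the message and response bodies are all assumptions made for the example; the dataset description does not say which fields the reported character total covers. No network access is involved.

```python
import re

# Hypothetical key names mirroring the seven fields listed in the description.
FIELDS = (
    "title", "inquiry_code", "user_id", "message_time",
    "message_content", "response_time", "response_content",
)

# Basic CJK Unified Ideographs block; an assumption about what
# counts as a "Chinese character" for the purposes of this sketch.
CJK = re.compile(r"[\u4e00-\u9fff]")


def deduplicate(records):
    """Keep the first record seen for each inquiry code."""
    seen = set()
    unique = []
    for rec in records:
        code = rec["inquiry_code"]
        if code in seen:
            continue  # same inquiry code already kept, so skip the repeat
        seen.add(code)
        unique.append(rec)
    return unique


def count_chinese_chars(records, fields=("message_content", "response_content")):
    """Count CJK ideographs in the chosen text fields of all records."""
    return sum(
        len(CJK.findall(rec.get(field) or ""))
        for rec in records
        for field in fields
    )


# Made-up sample records; the second one repeats the first inquiry code.
sample = [
    {"inquiry_code": "A001", "title": "t1",
     "message_content": "疫情abc", "response_content": "好"},
    {"inquiry_code": "A001", "title": "t1",
     "message_content": "疫情abc", "response_content": "好"},
    {"inquiry_code": "A002", "title": "t2",
     "message_content": "hello", "response_content": None},
]

unique_records = deduplicate(sample)
print(len(unique_records), count_chinese_chars(unique_records))
```

Keeping the first occurrence per inquiry code preserves crawl order, which makes the result easy to compare against the raw crawl output.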

Categories

Textual Database

Licence