Enhanced Dataset of Citizen Centric Complaints and Grievances on Twitter

Published: 23 Jul 2016 | Version 1 | DOI: 10.17632/w2cp7h53s5.1
Description of this data

The dataset "Complaints_Reports_Data.sql" contains public complaint tweets posted to four public service accounts of the Indian Government (@RailMinIndia, @IncomeTaxIndia, @DelhiPolice and @dtpTraffic). The file contains records of raw tweets, users, hashtags, user mentions and other contextual metadata of tweets and bloggers. The dataset also includes a sample of tweets pre-processed in three steps ("pre1", "pre3" and "pre4"): hashtag expansion, spelling error correction, and internet and slang expansion.
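The description names hashtag expansion as a pre-processing step but does not say how it was carried out. As a rough illustration only, the sketch below splits a hashtag into words at case and digit boundaries. The function name expand_hashtag and the splitting heuristic are assumptions made for this example, not the procedure used to build the "pre1" table.

```python
import re


def expand_hashtag(tag):
    """Split a hashtag into space-separated words (illustrative heuristic only)."""
    body = tag.lstrip("#")
    # Collect, in order: runs of capitals not followed by a lowercase letter
    # (acronyms), optionally capitalised lowercase words, and runs of digits.
    # Any other character (for example an underscore) acts as a separator.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", body)
    return " ".join(parts) if parts else body


if __name__ == "__main__":
    for example in ["#DelhiTraffic", "#IRCTCRefund", "#train12"]:
        print(example, "->", expand_hashtag(example))
```

An all-lowercase hashtag such as "#delhipolice" passes through unsplit under this heuristic; handling that case would need a dictionary-based segmenter, which is outside the scope of this sketch.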
Metadata of each table is given below:

Table 1: Annotated: tweet_ID, text, class (complaint or unknown)
Table 2: Hashtags: tweet_ID, hashtag
Table 3: Posts: tweet_ID, text, url_count, image_count, video_count, user_id, timestamp, organization (Indian Govt account), language, latitude, longitude, replied_to_tweet_id, replied_to_user_id, retweet
Tables 4, 5, 6: Pre1, Pre3, Pre4: tweet_ID, text, organization
Table 7: User_Mentions: tweet_ID, user_ID
Table 8: Users: user_ID, screen_name, name, verified?, location, created_at
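Because Annotated and Posts share tweet_ID, the two tables can be joined to count labelled complaints per government account. The sketch below shows that join on an in-memory SQLite stand-in, so it runs without the MySQL dump. The column types, the reduced column set, the placeholder rows and the stored label and organization strings are all assumptions made for illustration; the real values should be checked against the loaded database.

```python
import sqlite3

# Hypothetical stand-in for two of the tables, using only a subset of the
# columns listed in the metadata. Column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Posts (
        tweet_ID TEXT PRIMARY KEY, text TEXT, user_id TEXT, organization TEXT
    );
    CREATE TABLE Annotated (
        tweet_ID TEXT PRIMARY KEY, text TEXT, class TEXT
    );
    """
)

# Invented placeholder rows, not taken from the dataset.
posts = [
    ("1", "placeholder tweet A", "u1", "RailMinIndia"),
    ("2", "placeholder tweet B", "u2", "RailMinIndia"),
    ("3", "placeholder tweet C", "u3", "DelhiPolice"),
]
annotated = [
    ("1", "placeholder tweet A", "complaint"),
    ("2", "placeholder tweet B", "unknown"),
    ("3", "placeholder tweet C", "complaint"),
]
conn.executemany("INSERT INTO Posts VALUES (?, ?, ?, ?)", posts)
conn.executemany("INSERT INTO Annotated VALUES (?, ?, ?)", annotated)


def complaints_per_organization(connection):
    """Count tweets labelled 'complaint', grouped by organization."""
    query = """
        SELECT p.organization, COUNT(*) AS n
        FROM Annotated AS a
        JOIN Posts AS p ON p.tweet_ID = a.tweet_ID
        WHERE a.class = 'complaint'
        GROUP BY p.organization
        ORDER BY p.organization
    """
    return connection.execute(query).fetchall()


counts = complaints_per_organization(conn)
print(counts)
```

The SELECT statement itself uses only standard SQL, so it should carry over to the MySQL database with little or no change once the dump has been loaded.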

Steps to reproduce

mysql -u root -p
(enter your password when prompted, then run the following at the mysql> prompt)
create database citizen_complaints_sampled;
use citizen_complaints_sampled;
source Complaints_Reports_Data.sql;

Cite this dataset

Agarwal, Swati; Mittal, Nitish; Sureka, Ashish (2016), “Enhanced Dataset of Citizen Centric Complaints and Grievances on Twitter”, Mendeley Data, v1. http://dx.doi.org/10.17632/w2cp7h53s5.1

Statistics

Views: 68
Downloads: 11

Institutions

Indraprastha Institute of Information Technology Delhi

Categories

Information Retrieval, Data Mining, Social Media, Patient Social Context, Government Computing, Textual Databases, Public Record

Licence

CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?

You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY licence, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.
