Hansard motion policies

Published: 12 Sep 2018 | Version 2 | DOI: 10.17632/j83yzp7ynz.2

Description of this data

This dataset is designed for the testing of automatic opinion-topic classification systems.

For experiments performed using this data, see:
Abercrombie, G. and Batista-Navarro, R. (2018). "Identifying Opinion-Topics and Polarity of Parliamentary Debate Motions". In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA).

The dataset consists of 592 debate motions proposed in the UK House of Commons between 1997 and 2018, together with the 'policy' categories into which publicwhip.org.uk has classified the votes on these motions. It also includes metadata on the debates from which the motions were extracted and on the speakers who proposed them.

The dataset is presented as a CSV file in which each row represents a motion. It includes the following columns:

ID, "[date, speaker, party, title, motion, additional_information, party_array, date_continuous, policy 1087, policy 1065, policy 6709, policy 826, policy 6695, policy 6670, policy 1074, policy 1030, policy 1053, policy 856, policy 1110, policy 1051, policy 6694]"

ID = unique identification number from 0 to 591
date = the date on which the motion was proposed
speaker = the name of the MP who proposed the motion
party = party affiliation of the MP who proposed the motion
title = the title of the debate from which the motion is taken
motion = the textual content of the motion
additional_information = supplementary text that sometimes precedes the motion in the transcripts, such as the names of relevant documents or explanations of amendments
party_array = one-hot encoding of the party feature: ['Con', 'Lab', 'LD', 'None', 'SNP', 'DUP', 'PC']
date_continuous = date feature scaled between 0.0 and 1.0
policies = one-hot encoding of the policy labels:
policy 1087: "Asylum System — More strict"
policy 1065: "European Union — For"
policy 6709: "Further devolution to Scotland"
policy 826: "Homosexuality — Equal rights"
policy 6695: "More powers for local councils"
policy 6670: "Reduce Spending on Welfare Benefits"
policy 1074: "Schools — Greater Autonomy"
policy 1030: "Stop climate change"
policy 1053: "Terrorism laws — For"
policy 856: "Shift Powers from MPs in the Commons to Ministers"
policy 1110: "Increase VAT"
policy 1051: "Identity cards — For introduction"
policy 6694: "Higher taxes on alcoholic drinks"
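
The two derived features described above, party_array and date_continuous, can be illustrated with a minimal sketch. The party order is taken from the column description. Everything else is an assumption: the description does not state how date_continuous was computed, so min-max scaling over calendar days is used here as one plausible reading, and the function names and example dates are hypothetical rather than taken from the dataset.

```python
from datetime import date

# Party order as listed in the party_array column description.
PARTIES = ['Con', 'Lab', 'LD', 'None', 'SNP', 'DUP', 'PC']


def one_hot_party(party):
    """Return a 0/1 list with a single 1 at the index of the given party."""
    if party not in PARTIES:
        raise ValueError(f"unknown party: {party!r}")
    return [1 if p == party else 0 for p in PARTIES]


def scale_dates(dates):
    """Min-max scale dates to [0.0, 1.0] using their ordinal day numbers.

    Assumption: the dataset's own scaling method is not documented.
    """
    ordinals = [d.toordinal() for d in dates]
    lo, hi = min(ordinals), max(ordinals)
    if hi == lo:
        # All dates identical: avoid division by zero.
        return [0.0 for _ in ordinals]
    return [(o - lo) / (hi - lo) for o in ordinals]


# Hypothetical example dates, not rows from the dataset.
example_dates = [date(1997, 5, 1), date(2007, 5, 1), date(2018, 5, 1)]
scaled = scale_dates(example_dates)
lab_vector = one_hot_party('Lab')
```

Under this reading, the earliest motion in the collection would map to 0.0 and the latest to 1.0, and each party_array vector would contain exactly one 1.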

Text and metadata were extracted from https://www.theyworkforyou.com/pwdata/scrapedxml/debates/.

Policy data were extracted from https://www.publicwhip.org.uk/policies.php.

This dataset contains parliamentary information licensed under the Open Parliament Licence v3.0: https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/


    Cite this dataset

    Abercrombie, Gavin (2018), “Hansard motion policies”, Mendeley Data, v2 http://dx.doi.org/10.17632/j83yzp7ynz.2



The University of Manchester School of Computer Science, The University of Manchester


Social Sciences, Political Science, Data Science, Natural Language Processing, Statistical Natural Language Processing



CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence. You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY licence, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.