Content moderation from Twitter to X: policies, enforcement and community notes

Published: 10 March 2025 | Version 2 | DOI: 10.17632/d9ggkzp7bc.2
Contributors:
Emillie de Keulenaar

Description

This dataset contains: (1) Twitter/X policies from 2006 to late 2024 in markdown format; (2) calculations of policy changes over time; (3) Dutch-speaking community notes and corresponding posts, categorised by topic and user types; and (4) temporal search ranking of thousands of posts about immigration, dating from September to October 2023.

Steps to reproduce

The collection of platform policies, Community Notes, X posts, and their corresponding metadata (ranking, removal status, etc.) followed a structured methodology combining automated web scraping, machine learning-based classification, and statistical analysis.

To examine changes in X’s moderation policies over time, all available policy documents from 2006 to early 2025 were collected via the Internet Archive’s Wayback Machine and the Platform Governance Archive, then analysed with difflib.SequenceMatcher to track textual modifications. The researcher grouped policies into distinct moderation regimes, identifying shifts from strict enforcement through deletions and suspensions (2013–2017) to a more modular, demotion-based strategy (2017–2022) and, under Musk, an emphasis on “freedom of speech, not reach” (2022–present).

Community Notes were retrieved by scraping X’s official datasets and filtered with Google Sheets’ DETECTLANGUAGE function to isolate Dutch-language entries. A script then extracted the full metadata of the corresponding posts, including engagement metrics (likes, replies, reposts, views) and user information (follower count, description).

To classify both posts and users, a manual sample of the 100 most-engaged posts was first coded into topic categories; GPT-4o-mini was then prompted to assign all remaining posts to predefined topics such as immigration, climate change, and electoral politics, with categories refined iteratively through “snowballing”. Users were classified analogously from their descriptions and engagement behaviour, with labels distinguishing politicians (left, right, centrist), journalists, activists, influencers, and businesses.

To assess demotion practices, a list of search queries was designed to capture Dutch-language posts on immigration-related topics, which were then scraped daily over one month (September–October 2024).
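The diffing step can be sketched with Python’s standard-library difflib. The two policy strings below are invented for illustration and are not the actual X policy text:

```python
import difflib

# Two toy versions of a policy sentence (illustrative only).
old = "Accounts that engage in targeted abuse may be suspended."
new = "Accounts that engage in targeted abuse may have their reach limited."

# Compare word-by-word rather than character-by-character.
matcher = difflib.SequenceMatcher(None, old.split(), new.split())

# Overall similarity ratio between the two versions (1.0 = identical).
similarity = matcher.ratio()

# get_opcodes() lists equal/replace/delete/insert spans; the non-equal
# spans are the textual modifications between policy snapshots.
changes = [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

Run over successive Wayback Machine snapshots of the same policy page, the similarity ratio flags which snapshots changed, and the opcodes locate what changed.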
Each post’s position in search rankings was recorded alongside engagement statistics, allowing an analysis of how X’s algorithms treat different types of content over time. OpenAI’s Moderation API was used to assign a “hate score” to each post, enabling a correlation analysis between hate scores, ranking positions, and engagement levels. Pearson correlation coefficients measured whether higher hate scores were associated with lower rankings and reduced visibility, with a log transformation applied to normalise view counts.

To track removals, the dataset was revisited in February 2025 to check whether posts with high hate scores had been deleted or suspended; removals were classified into voluntary deletions, suspensions, and restrictions imposed by X. Because Community Notes labelled “Needs More Ratings” overwhelmingly dominated the dataset, a proportional weighting formula was applied to correct for this imbalance in the statistical analysis.
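The correlation step can be sketched as follows; the hate scores and view counts below are invented toy values, not figures from the dataset:

```python
import numpy as np

# Toy per-post measurements: a hate score in [0, 1] (as returned by a
# moderation classifier) and raw view counts. Values are illustrative.
hate_scores = np.array([0.05, 0.10, 0.30, 0.55, 0.70, 0.90])
views = np.array([12000, 9500, 4000, 1500, 600, 150])

# Log-transform view counts to normalise their heavy-tailed distribution;
# log1p keeps zero-view posts defined.
log_views = np.log1p(views)

# Pearson correlation coefficient between hate score and (log) visibility.
# A negative r would indicate that higher hate scores go with fewer views.
r = np.corrcoef(hate_scores, log_views)[0, 1]
```

The same coefficient can be computed per search-ranking position to test whether higher-scoring posts are demoted rather than merely less viewed.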

Institutions

Rijksuniversiteit Groningen, Universiteit van Amsterdam

Categories

Moderation, Twitter

Licence