Code and data for “From Face to Relations: Politeness strategies in Enron’s workplace email”

Published: 25 March 2026 | Version 1 | DOI: 10.17632/7b2t8hdy94.1
Contributors:

Description

This repository contains the data and code supporting the study “From Face to Relations: Politeness Strategies in Enron’s Workplace Email.” It makes the analytical pipeline transparent and reproducible, covering data construction, network modeling, and statistical analysis.

The folder discussion_output provides the processed datasets used to generate the figures in the article. These files correspond directly to the visualizations reported in the Discussion section and can be used to reproduce all plotted results.

The files case_candidates_by_PDR (case1–4).csv and case_candidates_by_role (case5–10).csv document the selection procedure for the qualitative case analyses. The former identifies candidate emails based on the P–D–R framework (Power, Distance, and Ranking of imposition), while the latter selects cases according to organizational roles within Enron (e.g., CEO and other hierarchical positions). These files make every illustrative example in the paper traceable to a systematic selection step.

The script discussion_experiments.py contains the full data analysis workflow used to produce the results in the Discussion section. Its outputs are the files stored in the discussion_output folder, including statistical summaries and plotting-ready data.

The dataset messages_R_with_politeness.csv is the core message-level file. It includes email metadata and text-derived features: request identification, the imposition score (R), the five dimensions into which R is decomposed (cost/effort, urgency, risk/accountability, autonomy constraint, and dependency blocking), and multi-label annotations of four politeness strategies (bald-on-record, positive politeness, negative politeness, and off-record).

The file node_metrics.csv contains node-level network attributes (e.g., centrality measures, clustering, community assignment, and power index), while dyadic_metrics.csv provides edge-level attributes (e.g., tie strength, relative power, and distance-related measures). The file global_metrics.txt reports overall network statistics.

Together, these resources support a multi-level analysis that integrates pragmatics with social network structure, allowing other researchers to replicate, validate, and extend the findings.
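As a first step before rerunning any analysis, it may help to inspect what each table actually contains. The following is a minimal sketch, not part of the repository: it uses only the Python standard library, takes the three CSV file names from the description above, and assumes (hypothetically) that they sit together in one directory. The function name summarize_tables and the short labels are illustrative, and no column names are assumed; they are read from each file's header.

```python
import csv
from pathlib import Path

# File names come from the description; the flat directory layout is an assumption.
TABLES = {
    "messages": "messages_R_with_politeness.csv",
    "nodes": "node_metrics.csv",
    "dyads": "dyadic_metrics.csv",
}


def summarize_tables(root):
    """Return {label: (column_names, row_count)} for each table found under root."""
    summary = {}
    for label, filename in TABLES.items():
        path = Path(root) / filename
        if not path.exists():
            continue  # skip files that are not where we assumed
        # utf-8-sig drops a byte-order mark if the CSV was exported with one
        with path.open(newline="", encoding="utf-8-sig") as handle:
            reader = csv.DictReader(handle)
            row_count = sum(1 for _ in reader)  # data rows, header excluded
            summary[label] = (list(reader.fieldnames or []), row_count)
    return summary


if __name__ == "__main__":
    # Assumes the script is run from the directory holding the downloaded files.
    for label, (columns, rows) in summarize_tables(".").items():
        print(f"{label}: {rows} rows, {len(columns)} columns")
        print("  " + ", ".join(columns))
```

Reading the headers first, rather than hard-coding column names, avoids guessing how the message-level, node-level, and edge-level tables are keyed before joining them.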

Files

Institutions

Categories

Computational Linguistics

Licence