Wapp-RPA: Public Analytic Pipeline for WhatsApp Digital Evidence in Legal and Forensic Psychology
Description
Companion public repository for the manuscript "WhatsApp Digital Evidence in Legal and Forensic Psychology: A Computational Framework for Reconstructive Psychological Assessment". The repository contains non-identifying R scripts, parser logic, preprocessing rules, synthetic example data, output templates, documentation, and materials for generating the aggregate supplementary tables. The protected raw WhatsApp transcript and the licensed LIWC2007 Spanish dictionary are not included.
Steps to reproduce
1. Download and unzip the repository.

2. Open R or RStudio from the repository root directory.

3. Install the required R packages if they are not already installed:

    install.packages(c(
      "readr", "stringr", "stringi", "dplyr", "lubridate",
      "tibble", "tidyr", "ggplot2", "scales", "openxlsx"
    ))

4. Set the input file and output directory. The repository includes a synthetic non-identifying WhatsApp example for reproducibility:

    Sys.setenv(WAPP_INPUT_FILE = "synthetic_data/synthetic_whatsapp_export_example.txt")
    Sys.setenv(WAPP_OUTPUT_ROOT = "outputs")

5. Run the public scripts sequentially from the repository root:

    source("scripts/01_corpus_characterization.R", encoding = "UTF-8")
    source("scripts/02_temporal_organization.R", encoding = "UTF-8")
    source("scripts/03_participation_turn_taking.R", encoding = "UTF-8")
    source("scripts/04_response_latency.R", encoding = "UTF-8")
    source("scripts/05_session_structure_bursts_density.R", encoding = "UTF-8")
    source("scripts/06_lexical_discourse_patterning.R", encoding = "UTF-8")
    source("scripts/07_stylistic_paratextual_placeholders.R", encoding = "UTF-8")
    source("scripts/08_exploratory_affective_signals_LIWC2007.R", encoding = "UTF-8")
    source("scripts/09_robustness_sensitivity_summary_LIWC2007.R", encoding = "UTF-8")
    source("scripts/10_focal_episode_selection_contextual_verification_LIWC2007.R", encoding = "UTF-8")
    source("scripts/10B_auto_contextual_sufficiency_scoring_LIWC2007.R", encoding = "UTF-8")
    source("scripts/10C_integrate_contextual_sufficiency_outputs.R", encoding = "UTF-8")

6. Optional: compile the aggregate supplementary tables workbook:

    source("scripts_support/create_supplementary_tables.R", encoding = "UTF-8")

7. Outputs will be written to the outputs/ directory. The supplementary workbook will be written as Supplementary_Tables_Wapp.xlsx unless another filename is specified through WAPP_SUPPLEMENTARY_TABLES_FILENAME.

8. LIWC2007 analyses are optional and require a licensed local copy of the LIWC2007 Spanish dictionary. The dictionary is not included in this repository. To enable LIWC-based outputs, define the dictionary path before running the LIWC-dependent scripts:

    Sys.setenv(WAPP_LIWC_DICT = "path/to/licensed_LIWC2007_Spanish_dictionary.csv")

   If WAPP_LIWC_DICT is not defined, the LIWC-dependent scripts will run in public mode and record that LIWC2007 outputs were skipped.

9. The protected raw WhatsApp transcript used in the manuscript is not shared. Reproducibility therefore concerns the analytic workflow, parser logic, preprocessing rules, synthetic example execution, output structure, and aggregate table-generation process, not reanalysis of the protected private corpus.
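The steps above can also be wrapped in a small driver, sketched below under stated assumptions. The helper names `missing_packages()` and `run_pipeline()` are hypothetical and are not part of the repository; the package names, script paths, and environment variables are taken from the steps as listed. The sketch only checks prerequisites, sets the environment variables, and sources the scripts in the documented order. It does not reproduce any logic from the scripts themselves.

```r
# Packages and scripts exactly as listed in the steps above.
required_packages <- c(
  "readr", "stringr", "stringi", "dplyr", "lubridate",
  "tibble", "tidyr", "ggplot2", "scales", "openxlsx"
)

pipeline_scripts <- file.path("scripts", c(
  "01_corpus_characterization.R",
  "02_temporal_organization.R",
  "03_participation_turn_taking.R",
  "04_response_latency.R",
  "05_session_structure_bursts_density.R",
  "06_lexical_discourse_patterning.R",
  "07_stylistic_paratextual_placeholders.R",
  "08_exploratory_affective_signals_LIWC2007.R",
  "09_robustness_sensitivity_summary_LIWC2007.R",
  "10_focal_episode_selection_contextual_verification_LIWC2007.R",
  "10B_auto_contextual_sufficiency_scoring_LIWC2007.R",
  "10C_integrate_contextual_sufficiency_outputs.R"
))

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: validate inputs, set the environment variables,
# and source each script in the order given.
run_pipeline <- function(scripts, input_file, output_root, liwc_dict = "") {
  # Fail early, before anything is modified, if a file is missing.
  if (!file.exists(input_file)) {
    stop("Input file not found: ", input_file)
  }
  absent <- scripts[!file.exists(scripts)]
  if (length(absent) > 0) {
    stop("Missing scripts: ", paste(absent, collapse = ", "))
  }

  Sys.setenv(WAPP_INPUT_FILE = input_file, WAPP_OUTPUT_ROOT = output_root)
  dir.create(output_root, recursive = TRUE, showWarnings = FALSE)

  # Export the dictionary path only when a readable file is supplied.
  # Otherwise leave it unset so the LIWC-dependent scripts use public mode.
  if (nzchar(liwc_dict) && file.exists(liwc_dict)) {
    Sys.setenv(WAPP_LIWC_DICT = liwc_dict)
  } else {
    Sys.unsetenv("WAPP_LIWC_DICT")
    message("WAPP_LIWC_DICT not set; LIWC2007 outputs are expected to be skipped.")
  }

  for (s in scripts) {
    message("Running ", s)
    source(s, encoding = "UTF-8")
  }
  invisible(scripts)
}

# Example call from the repository root (commented out so that the
# definitions can be loaded without running the pipeline):
# to_install <- missing_packages(required_packages)
# if (length(to_install) > 0) install.packages(to_install)
# run_pipeline(
#   pipeline_scripts,
#   input_file  = "synthetic_data/synthetic_whatsapp_export_example.txt",
#   output_root = "outputs"
# )
```

The script paths are listed explicitly rather than discovered with `list.files()` and sorted, because whether `10_...` sorts before `10B_...` depends on the collation locale, and the documented order places `10_` first.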
Institutions
- Pontificia Universidad Católica del Ecuador, Pichincha, Quito