Large language model-assisted natural language processing reveals stigmatizing discourse regarding psoriasis on social media

Published: 13 October 2025 | Version 1 | DOI: 10.17632/9xxyv36zdn.1
Contributors:
Zijun Wu

Description

This dataset provides supplementary materials supporting the Research Letter accepted for reconsideration in JAAD International. It contains detailed methodological, computational, and validation information omitted from the main text due to space constraints. Specifically, the dataset includes:

1. Detailed Methodology: full description of data preprocessing, traditional-to-simplified Chinese text normalization, the rule-based classification schema (26 rules), and model configuration parameters for DeepSeek-V3.
2. Model Prompts and Validation Workflow: structured prompts, fault-tolerant parsing logic, and manual double-coding procedures used to benchmark model performance.
3. Evaluation Outputs: performance metrics (precision, sensitivity, F1-score) for the stigmatization, destigmatization, and neutral categories, including confusion matrix data.
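For readers unfamiliar with how the per-category metrics named above relate to a confusion matrix, the following is a minimal sketch in Python. It is not the authors' evaluation code. The function name `per_class_metrics`, the row/column convention (rows are manual labels, columns are model labels), and the counts in `example_cm` are illustrative assumptions; the counts are placeholders, not values from this dataset.

```python
from typing import Dict, List

# Category names taken from the dataset description.
LABELS = ["stigmatization", "destigmatization", "neutral"]


def per_class_metrics(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Compute one-vs-rest precision, sensitivity (recall), and F1 per class.

    Assumed convention: cm[i][j] is the number of comments whose manual
    label is labels[i] and whose model-assigned label is labels[j].
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                           # correctly assigned to class k
        fn = sum(cm[k]) - tp                    # class k comments the model missed
        fp = sum(row[k] for row in cm) - tp     # other comments assigned to class k
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "sensitivity": sensitivity,
            "f1": f1,
        }
    return metrics


# Placeholder counts for illustration only; NOT data from the study.
example_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]

if __name__ == "__main__":
    for name, values in per_class_metrics(example_cm, LABELS).items():
        print(name, {k: round(v, 3) for k, v in values.items()})
```

To apply the sketch to the supplementary confusion matrix, one would replace `example_cm` with the reported counts and confirm that the row/column orientation matches the convention assumed here.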

Files

Steps to reproduce

The supplementary materials are descriptive and do not involve data requiring reproduction. All figures were generated using aggregated, de-identified comment data described in the main article.

Institutions

Central South University Xiangya School of Public Health

Categories

Social Media, Natural Language Processing, Psoriasis, Social Stigma, Large Language Model

Licence