Data and code for emission-prioritized explainable AI auditing of circular-economy and carbon-neutrality integration in project environmental assessment documents
Description
This research data package supports the manuscript entitled “Emission-prioritized explainable AI for auditing circular-economy and carbon-neutrality integration in project environmental assessment documents”. It contains processed document-level features, sentence-level evidence, country-sector indicators, topic/model validation outputs, text-quality diagnostics, manual annotation validation files and data-related scripts used to reproduce the main analytical results. Source links and retrieval metadata for the public document and indicator repositories are provided in the package to support transparency and reuse.
Steps to reproduce
Unzip the data package and review core_data/README.md, core_data/core_data_inventory.csv and core_data/source_link_inventory.csv for file-level descriptions and source metadata.
The main analytical results can be checked using the processed data layers in core_data, including document-level features, sentence-level evidence, country-sector indicators, text-quality diagnostics, model/topic validation outputs and manual annotation validation files.
For data-layer reproduction, run the scripts in core_scripts in the following order:
- hpc_build_worldbank_features_py36.py
- hpc_sentence_evidence_parallel_py38.py
- hpc_country_sector_indicator_panel_py38.py
- hpc_fulltext_quality_sentence_samples_py38.py
- hpc_validation_bootstrap_parallel_py38.py
These scripts generate the corresponding processed data layers listed in core_data/core_data_inventory.csv. Python 3.8 is recommended for most scripts; hpc_build_worldbank_features_py36.py was prepared for Python 3.6 compatibility.
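
The script order described above can be sketched as a small driver. This is a minimal sketch, not part of the data package: the helper names (interpreter_for, build_commands, run_all), the interpreter executable names python3.6 and python3.8, and the assumption that each script runs without command-line arguments are all hypothetical. Check core_data/README.md for the options each script actually expects.

```python
import subprocess
from pathlib import Path

# Script order as listed in the reproduction steps.
SCRIPTS = [
    "hpc_build_worldbank_features_py36.py",
    "hpc_sentence_evidence_parallel_py38.py",
    "hpc_country_sector_indicator_panel_py38.py",
    "hpc_fulltext_quality_sentence_samples_py38.py",
    "hpc_validation_bootstrap_parallel_py38.py",
]


def interpreter_for(script):
    """Pick an interpreter from the filename suffix.

    Assumption: "_py36" marks the Python 3.6 script and every other
    script uses Python 3.8; the executable names are placeholders.
    """
    if script.endswith("_py36.py"):
        return "python3.6"
    return "python3.8"


def build_commands(script_dir="core_scripts"):
    """Return one [interpreter, script_path] command per script, in order."""
    return [[interpreter_for(s), str(Path(script_dir) / s)] for s in SCRIPTS]


def run_all(script_dir="core_scripts", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(script_dir):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing script,
            # so later steps never run on incomplete data layers.
            subprocess.run(cmd, check=True)
```

Calling run_all() from the unzipped package root prints the five commands without executing anything; run_all(dry_run=False) executes them in sequence.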
Institutions
- Wuhan University of Science and Technology, Wuhan, Hubei