Datasets Comparison

Version 1

60,000 SCAPS-1D samples spanning 15 electron transport layer (ETL) / hole transport layer (HTL) combinations (TiO₂, SnO₂, ZnO × CuI, Cu₂O, Spiro-OMeTAD, P3HT, PTAA) under 900 lux cool-white and warm-white LEDs

Published: 5 May 2026 | Version 1 | DOI: 10.17632/3ds9wrkdty.1
Contributor: chih hsi PENG

Description

This dataset supports the study "Machine-learning-driven Pareto screening of transport layers for indoor CsPbI₂Br perovskite solar cells with zero-shot cross-light-source calibration." All device simulations were performed with SCAPS-1D v3.3.11 under two indoor LED spectra at 900 lux: cool-white (0.309 mW/cm²) and warm-white (0.278 mW/cm²). They cover 15 electron-transport-layer/hole-transport-layer (ETL/HTL) combinations formed by three ETLs (TiO₂, SnO₂, ZnO) and five HTLs (CuI, Cu₂O, PTAA, P3HT, Spiro-OMeTAD). The four photovoltaic targets predicted by the machine learning models are power conversion efficiency (PCE), open-circuit voltage (Voc), short-circuit current density (Jsc), and fill factor (FF).

The repository is organized into five folders: data, def, code, models, and results.

The data folder contains two CSV files (train_warm_0278.csv and train_cold_0309.csv, approximately 30,000 samples each) holding the simulated device parameters and the corresponding photovoltaic outputs for each light source. The def folder provides the 15 SCAPS-1D device definition files (.def) used to generate the simulations, one per ETL/HTL combination.

The code folder contains four Python scripts that reproduce all results in the paper. train_xgboost.py (Step 1) trains single-task XGBoost regressors with SHAP analysis. train_fttransformer.py (Step 2) trains the multi-task Feature Tokenizer + Transformer (FT-Transformer) with attention-weight analysis. ch4_2_spectral_analysis.py (Step 3) evaluates cross-light-source generalization across three training tiers (domain-aware, domain-blind, and source-only) with six calibration strategies. ch4_3_material_analysis.py (Step 4) performs Monte Carlo robustness analysis, defect-tolerance ranking, and Pareto-frontier optimization over all 15 material combinations. The scripts must be run in order; alternatively, Steps 3 and 4 can load the pre-trained weights provided in the models folder to skip retraining.

The results folder contains all figures and CSV tables generated by the four scripts, organized by chapter section.
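As a quick orientation to the quantities named above, the following minimal Python sketch enumerates the 15 ETL/HTL combinations and shows how PCE relates to Voc, Jsc, and FF at the two stated irradiances, using the standard relation PCE = Voc × Jsc × FF / P_in. It is not taken from the repository scripts. The function names, the dictionary keys, and the unit conventions (Voc in V, Jsc in mA/cm², FF in percent) are assumptions made for illustration, and the numeric inputs in the usage line are hypothetical rather than values from the dataset.

```python
from itertools import product

# Materials listed in the dataset description.
ETLS = ["TiO2", "SnO2", "ZnO"]
HTLS = ["CuI", "Cu2O", "PTAA", "P3HT", "Spiro-OMeTAD"]

# Incident power density at 900 lux in mW/cm^2, as stated in the description.
# The dictionary keys are illustrative names, not identifiers from the repository.
IRRADIANCE_MW_CM2 = {"cool_white": 0.309, "warm_white": 0.278}


def etl_htl_combinations():
    """Return every (ETL, HTL) pair: 3 ETLs x 5 HTLs = 15 combinations."""
    return list(product(ETLS, HTLS))


def pce_percent(voc_v, jsc_ma_cm2, ff_percent, source):
    """Power conversion efficiency in percent.

    Assumed units: Voc in V, Jsc in mA/cm^2, FF in percent.
    V * mA/cm^2 gives mW/cm^2, the same unit as the incident power.
    """
    p_in = IRRADIANCE_MW_CM2[source]
    p_out = voc_v * jsc_ma_cm2 * (ff_percent / 100.0)
    return 100.0 * p_out / p_in


combos = etl_htl_combinations()
print(len(combos))  # 15

# Hypothetical device outputs, chosen only to exercise the formula.
print(round(pce_percent(1.0, 0.1, 80.0, "cool_white"), 2))
```

Because the two spectra deliver different incident power at the same illuminance, identical Voc, Jsc, and FF values yield a different PCE under each source. This is worth keeping in mind when comparing rows across the two CSV files.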

Categories

Solar Cell, Machine Learning

Licence

Creative Commons Attribution 4.0 International

Version 2

60,000 SCAPS-1D samples spanning 15 electron transport layer (ETL) / hole transport layer (HTL) combinations (TiO₂, SnO₂, ZnO × CuI, Cu₂O, Spiro-OMeTAD, P3HT, PTAA) under 900 lux cool-white and warm-white LEDs

Published: 8 May 2026 | Version 2 | DOI: 10.17632/3ds9wrkdty.2
Contributor: chih hsi PENG
