Reproducibility Package: Basel-Aware Regime-Weighted Stacking (BARWS) for Multi-Asset Value-at-Risk Forecasting

Published: 21 May 2026 | Version 2 | DOI: 10.17632/3rf45wghy6.2

Description

This package contains the R source code and result artifacts for the Basel-Aware Regime-Weighted Stacking (BARWS) methodology from the author's Doctor of Computer Science dissertation at Bina Nusantara University. BARWS extends ensemble Value-at-Risk forecasting in two ways: (1) it augments the pinball loss with a Basel III Traffic Light surcharge, a smooth Kupiec coverage penalty, and a Christoffersen clustering penalty, so that accuracy and regulatory compliance are optimized jointly; and (2) it uses regime-conditional weights (Normal/Volatile/Crisis) obtained by EWMA-then-k-means clustering, with an optional soft mixture-of-experts.

The stacking ensemble combines seven base learners (HS, VCV, MCS, LDA-FFT, LDA-Panjer, XGBoost with conformal calibration, and an LSTM quantile regressor) over a five-asset portfolio (USDIDR, EURUSD, USDJPY, XAUUSD, BTCUSD) across three regulatory periods:

- A: 2014-09-17 → 2019-12-31 (pre-COVID)
- B: 2020-01-01 → 2021-12-31 (COVID shock)
- C: 2022-01-01 → 2026-03-10 (post-COVID + crypto winter)

The bundle includes:

(a) twelve pedagogical R scripts (LAMPIRAN A–L, i.e. Appendices A–L) matching the dissertation appendix structure, with full mathematical exposition;
(b) three runnable master scripts:
    - dcs-bky-std-v28.R: Phase-1 batch, 12 configurations = 3 periods × 4 loss modes;
    - dcs-bky-std-v28b.R: canonical for the accompanying BARWS paper; reproduces +1.98 bps RCE and ΔL_pin = −2.95 [95% CI −6.30, −0.18] for Period A ENS_ALL;
    - dcs-bky-std-v28d.R: symmetric-Kupiec variant; empirically identical winners to v28b across all five feasible cells, included as a validation companion;
(c) the full Phase-2 lambda-sweep Pareto-frontier output (246 child runs; 9 × 9 grid × 3 periods);
(d) an automation script (populate-results.R) that maps raw output to dissertation tables and figures;
(e) a representative single-run output from the Period A winning configuration (λ_basel = 0.25, λ_kup = 2, ENS_ALL; +1.98 bps RCE, Basel YELLOW 3.65, Kupiec p = 0.4466); and
(f) a complete manifest of all Chapter IV artifacts (16 entries: 8 tables + 8 figures).

WHAT'S NEW IN VERSION 2 (2026-05-20)

- Added master scripts v28b (paper canonical) and v28d (validation companion).
- Populated results/tables/ and results/figures/ with all eight final tables (4.1–4.8) and figures (4.1–4.8).
- Replaced data/sample_output/ with the Period A winning configuration.
- Extended the manifest from 5 to 16 entries.

Raw price data is fetched live from Yahoo Finance and the Binance API on each run and is not redistributed (subject to provider ToS). All R code, documentation, and derived outputs are released under CC BY 4.0. Reproduction time: ~5 h for the batch run, ~14.5 h for the Phase-2 sweep (246 runs).
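The loss components named in the description can be illustrated with a minimal base-R sketch. It shows the plain pinball loss, the standard Kupiec proportion-of-failures test, and the standard Basel traffic-light multiplier table. The function names (pinball_loss, kupiec_pof, basel_multiplier), the 1% tail probability, and the 250-day window are assumptions made for illustration; the exact surcharge and smooth-penalty forms used by BARWS are defined in the bundled scripts and are not reproduced here.

```r
# Pinball (quantile) loss of a VaR forecast at tail probability tau.
# `var_forecast` is the forecast tau-quantile of returns (typically negative).
pinball_loss <- function(returns, var_forecast, tau = 0.01) {
  u <- returns - var_forecast
  mean(ifelse(u >= 0, tau * u, (tau - 1) * u))
}

# Kupiec proportion-of-failures (unconditional coverage) test.
# Returns the likelihood-ratio statistic and its chi-square(1) p-value.
kupiec_pof <- function(n_obs, n_exc, tau = 0.01) {
  loglik <- function(p) {
    # Use the 0 * log(0) = 0 convention for boundary cases.
    a <- if (n_exc > 0) n_exc * log(p) else 0
    b <- if (n_obs - n_exc > 0) (n_obs - n_exc) * log(1 - p) else 0
    a + b
  }
  lr <- -2 * (loglik(tau) - loglik(n_exc / n_obs))
  list(lr = lr, p_value = pchisq(lr, df = 1, lower.tail = FALSE))
}

# Basel traffic-light capital multiplier for a 250-day window at 99% VaR:
# green (0-4 exceptions) = 3.00, yellow (5-9) = 3.40 to 3.85, red (10+) = 4.00.
basel_multiplier <- function(n_exc) {
  stopifnot(n_exc >= 0)
  if (n_exc >= 10) return(4.00)
  plus <- c(0, 0, 0, 0, 0, 0.40, 0.50, 0.65, 0.75, 0.85)  # 0..9 exceptions
  3 + plus[n_exc + 1]
}

# Illustration on simulated (not real) returns.
set.seed(1)
r <- rnorm(250, sd = 0.01)
q <- rep(qnorm(0.01, sd = 0.01), 250)
n_exc <- sum(r < q)
pinball_loss(r, q)
kupiec_pof(250, n_exc)$p_value
basel_multiplier(n_exc)
```

Under this table, a multiplier of 3.65 corresponds to seven exceptions in the yellow zone, which is one plausible reading of the "Basel YELLOW 3.65" figure reported for the winning configuration.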

Steps to reproduce

SYSTEM REQUIREMENTS

- R version 4.2.0 or higher (tested on R 4.3.2 and 4.4.2; macOS / Linux / Windows)
- Internet connection (for the live price data fetch from Yahoo Finance and the Binance API)
- ~5 hours runtime for batch mode, ~14.5 hours for lambda sweep mode (246 runs)

R PACKAGE DEPENDENCIES

    install.packages(c("quantmod", "xts", "zoo", "dplyr", "ggplot2", "fitdistrplus", "xgboost", "torch", "jsonlite", "kableExtra"))

STEP 1: DOWNLOAD AND EXTRACT

Download the ZIP archive from Mendeley Data and extract it to a working directory.

STEP 2: REPRODUCE BATCH RESULTS (Chapter IV Tables 4.1–4.4, Figures 4.1–4.7)

The canonical script for the accompanying BARWS paper is dcs-bky-std-v28b.R (hybrid smooth-proxy variant). By default, the setting batch_run_all <- TRUE spawns 12 child Rscript processes (3 regulatory periods × 4 loss configurations):

    cd mendeley_data
    Rscript --no-save --no-restore dcs-bky-std-v28b.R

This produces ./output_batch/ with per-configuration subfolders containing CSV daily series, PNG backtest plots, and HTML diagnostic reports. For the symmetric-Kupiec validation variant, substitute dcs-bky-std-v28d.R. The original Phase-1 script dcs-bky-std-v28.R is retained for backward compatibility.

STEP 3: REPRODUCE LAMBDA SWEEP RESULTS (Chapter IV Tables 4.5–4.8, Figure 4.8)

Edit dcs-bky-std-v28b.R: set batch_run_all <- FALSE and lambda_sweep_run <- TRUE. Then run:

    Rscript --no-save --no-restore dcs-bky-std-v28b.R

This produces ./output_lambda_sweep/ with the full Pareto frontier walk over 9 × 9 = 81 lambda combinations per period and 3 ensembles, 246 child runs in total (~14.5 hours wall-clock). The Period A winning configuration (λ_basel = 0.25, λ_kup = 2, ENS_ALL) yields +1.98 bps RCE and ΔL_pin = −2.95 (95% CI [−6.30, −0.18], B = 1,000 paired block bootstrap).

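The confidence interval quoted for ΔL_pin comes from a paired block bootstrap with B = 1,000 replications. The following base-R sketch shows one common way to build such an interval: a moving-block bootstrap on the paired daily loss differences with a percentile interval. The function name paired_block_bootstrap_ci, the block length of 10, and the percentile method are assumptions for illustration; the bundled scripts may use a different block scheme or interval type.

```r
# Percentile CI for the mean paired loss difference via a moving-block bootstrap.
# loss_a, loss_b: daily losses of two models on the same dates (hypothetical inputs).
paired_block_bootstrap_ci <- function(loss_a, loss_b, B = 1000,
                                      block_len = 10, level = 0.95) {
  stopifnot(length(loss_a) == length(loss_b))
  d <- loss_a - loss_b                 # pairing: difference day by day
  n <- length(d)
  stopifnot(block_len >= 1, block_len <= n)
  n_blocks  <- ceiling(n / block_len)  # blocks needed to cover n days
  max_start <- n - block_len + 1       # last admissible block start
  boot_means <- replicate(B, {
    starts <- sample.int(max_start, n_blocks, replace = TRUE)
    # Each column holds one block of consecutive indices; concatenate and trim to n.
    idx <- as.vector(outer(seq_len(block_len) - 1L, starts, `+`))[seq_len(n)]
    mean(d[idx])
  })
  alpha <- 1 - level
  list(estimate = mean(d),
       ci = unname(quantile(boot_means, c(alpha / 2, 1 - alpha / 2))))
}

# Illustration on simulated (not real) loss series.
set.seed(42)
loss_ens  <- abs(rnorm(500, mean = 1.0, sd = 0.3))
loss_base <- loss_ens + rnorm(500, mean = 0.05, sd = 0.2)
paired_block_bootstrap_ci(loss_ens, loss_base)
```

Resampling blocks rather than single days keeps short-range dependence in the loss differences, which is the usual motivation for a block scheme on daily financial series.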
STEP 4: AUTO-POPULATE DISSERTATION RESULTS FOLDER

The automation script renames raw output to citation-friendly names:

    Rscript populate-results.R

This populates ./results/tables/table-4-X-*.csv and ./results/figures/figure-4-X-*.png, and extends manifest.csv, which maps every Chapter IV reference to its source file (16 entries total: 8 tables + 8 figures).

STEP 5: CAPTURE ENVIRONMENT METADATA (optional)

After completion, regenerate the session info snapshot:

    Rscript session_info/regenerate.R

This writes session_info/R_sessionInfo.txt with full package versions, BLAS backend, and RNG state. The bundle already includes the author's environment snapshot; this step is only needed if you want to capture your own.

VERIFICATION

Compare your reproduced output against data/sample_output/, which contains the single run of the Period A winning configuration (λ_basel = 0.25, λ_kup = 2, ENS_ALL, baselaware_regime mode; +1.98 bps RCE) bundled with the package.

For detailed methodology, see the 12 pedagogical scripts LAMPIRAN-R-SCRIPT-A, ..., LAMPIRAN-R-SCRIPT-L, which document each component (configuration, LDA, ML, ensemble meta-learner, regime detection, lambda sweep).
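The verification step can be partly automated. The sketch below compares the numeric columns of a reproduced CSV against a bundled reference CSV and reports the largest absolute difference per column. The helper name compare_csv, the tolerance, and the file paths in the usage comment are hypothetical; the actual file names inside output_batch/ and data/sample_output/ are not listed in this description. Because prices are fetched live, small differences from the bundled run may be expected.

```r
# Compare numeric columns shared by two CSV files.
# Returns one row per shared numeric column with the max absolute difference.
compare_csv <- function(reproduced_path, reference_path, tol = 1e-6) {
  rep_df <- read.csv(reproduced_path, stringsAsFactors = FALSE)
  ref_df <- read.csv(reference_path, stringsAsFactors = FALSE)
  shared <- intersect(names(rep_df), names(ref_df))
  is_num <- vapply(shared, function(nm) {
    is.numeric(rep_df[[nm]]) && is.numeric(ref_df[[nm]])
  }, logical(1))
  num_cols  <- shared[is_num]
  same_rows <- nrow(rep_df) == nrow(ref_df)
  max_diff <- vapply(num_cols, function(nm) {
    # Row counts must match for an element-wise comparison.
    if (!same_rows || nrow(rep_df) == 0) return(NA_real_)
    max(abs(rep_df[[nm]] - ref_df[[nm]]))
  }, numeric(1))
  data.frame(column       = num_cols,
             max_abs_diff = unname(max_diff),
             within_tol   = !is.na(max_diff) & max_diff <= tol,
             row.names    = NULL)
}

# Usage with hypothetical paths (replace with the real file names):
# compare_csv("output_batch/<config>/daily_series.csv",
#             "data/sample_output/daily_series.csv", tol = 1e-4)
```

A column flagged with within_tol = FALSE, or with max_abs_diff = NA because the row counts differ, is a prompt to check whether the data window or package versions differ from the author's snapshot in session_info/R_sessionInfo.txt.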

Categories

Computer Science, Econometrics, Financial Risk Management

Licence