Data and replication materials for the LESG index: logistics, governance, sustainability and development readiness
Description
This dataset supports an empirical investigation of national systemic readiness for sustainable development, looking beyond conventional outcome-based indicators such as Gross Domestic Product (GDP). The underlying research hypothesis is that development readiness is not adequately captured by isolated measures of income, logistics performance, or sustainability outcomes, but instead emerges from the structural coherence among logistics capability, governance quality, and environmental and social sustainability conditions.

The dataset operationalizes this concept through the construction of the LESG (Logistics–Environmental, Social and Governance) index, using country-level data for a balanced cross-sectional sample of 123 countries. Four widely used international indicators are included: the World Bank’s Logistics Performance Index (LPI), the Worldwide Governance Indicators (WGI), the Environmental Performance Index (EPI), and the Sustainable Development Goals (SDG) Index. All variables are derived from publicly available sources and represent relatively stable structural characteristics rather than short-term economic fluctuations. Prior to analysis, the indicators were harmonized to a common scale to ensure cross-country comparability, and an alternative equal-weight index was constructed to assess sensitivity to aggregation choices.

The analytical framework underlying the dataset is diagnostic rather than causal. Principal Component Analysis (PCA) is employed to identify the latent structure underlying the four dimensions and to derive variance-based weights for the LESG index. The results indicate a single dominant component explaining more than 80% of total variance, with all dimensions loading strongly and positively, suggesting that logistics performance, governance quality, and sustainability outcomes form a coherent readiness construct. External validation is conducted through regression analysis against GDP per capita, interpreted as an assessment of coherence rather than causality. To explore structural heterogeneity, hierarchical and k-means clustering techniques are applied to classify countries into distinct systemic readiness regimes.

The dataset enables full replication of these procedures and supports comparative analysis of development readiness across countries. It is intended as a transparent diagnostic tool for research and policy analysis, allowing users to examine how different structural configurations shape development capacity while remaining explicit about methodological choices and limitations.
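As an illustration of how variance-based weights can be derived from a single dominant component, the following Python sketch runs PCA on the correlation matrix of four indicators. It is not the authors' SPSS procedure. The data are synthetic, the names `pca_weights`, `X_demo` and `lesg_demo` are hypothetical, and the weighting rule (squared first-component loadings normalized to sum to one) is an assumed convention, since the description does not state the exact formula.

```python
import numpy as np

def pca_weights(X):
    """Return (weights, explained_share, loadings) for the first principal component.

    X is an (n_countries, n_indicators) array. PCA is run on the correlation
    matrix, so each indicator is implicitly standardized.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    top = int(np.argmax(eigvals))
    vec = eigvecs[:, top]
    if vec.sum() < 0:                         # eigenvector sign is arbitrary; orient it positively
        vec = -vec
    loadings = vec * np.sqrt(eigvals[top])    # correlations between indicators and the component
    # ASSUMPTION: weights proportional to squared loadings (not confirmed by the source).
    weights = loadings ** 2 / np.sum(loadings ** 2)
    explained = eigvals[top] / eigvals.sum()  # share of total variance on the first component
    return weights, explained, loadings

# Synthetic stand-in for 123 countries x 4 harmonized indicators (NOT the real data):
# one shared factor plus indicator-specific noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(123, 1))
X_demo = 50 + 15 * common + 5 * rng.normal(size=(123, 4))

weights, explained, loadings = pca_weights(X_demo)
# ASSUMPTION: the index is the weighted sum of the 0-100 indicators.
lesg_demo = X_demo @ weights
```

With the real indicator matrix in place of `X_demo`, `explained` can be compared against the "more than 80%" figure reported above, and `loadings` against the claim that all four dimensions load positively.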
Steps to reproduce
The dataset was constructed following a structured, multi-stage workflow designed to ensure transparency, comparability, and reproducibility. Data were gathered exclusively from publicly available international databases widely used in empirical research and policy analysis. Specifically, country-level indicators were collected from the World Bank’s Logistics Performance Index (LPI), the Worldwide Governance Indicators (WGI), the Environmental Performance Index (EPI), and the Sustainable Development Goals (SDG) Index. These sources were selected because they capture complementary structural dimensions of development (logistics capability, institutional quality, and environmental and social sustainability) and are produced using standardized methodologies. The most recent available observations were used for all indicators, and countries were retained in the final sample only if complete and consistent data were available across all four dimensions, resulting in a balanced cross-sectional dataset of 123 countries.

Prior to analysis, the raw indicators were harmonized to ensure numerical comparability. Indicators originally reported on a 0–100 scale (EPI and SDG Index) were retained without transformation, while the LPI (1–5 scale) and the WGI composite (−2.5 to +2.5 scale) were linearly rescaled to a common 0–100 range. Because the rescaling is linear and increasing, it preserves relative country rankings. This preprocessing step was performed using spreadsheet software and verified within SPSS to ensure consistency. No imputation procedures were applied; instead, listwise completeness was enforced to avoid introducing artificial variation. In parallel, an alternative equal-weight index was constructed by averaging the four normalized indicators, providing a benchmark for robustness assessment. All statistical analyses were conducted using IBM SPSS Statistics.
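The harmonization and equal-weight benchmark described above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet or SPSS procedure actually used. It assumes the rescaling endpoints are the theoretical scale bounds (1 and 5 for the LPI, −2.5 and +2.5 for the WGI composite) rather than sample minima and maxima, which the text does not specify. The function names and the example values are hypothetical.

```python
def rescale(value, lo, hi):
    """Linearly map a value from the scale [lo, hi] onto [0, 100]."""
    return 100.0 * (value - lo) / (hi - lo)

def harmonize(lpi, wgi, epi, sdg):
    """Put the four indicators on a common 0-100 scale.

    ASSUMPTION: theoretical scale bounds are used as the rescaling endpoints.
    EPI and SDG Index are already on 0-100 and pass through unchanged.
    """
    return {
        "LPI": rescale(lpi, 1.0, 5.0),
        "WGI": rescale(wgi, -2.5, 2.5),
        "EPI": float(epi),
        "SDG": float(sdg),
    }

def equal_weight_index(scores):
    """Benchmark index: simple mean of the harmonized indicators."""
    return sum(scores.values()) / len(scores)

# Made-up values for one hypothetical country (not taken from the dataset).
example = harmonize(lpi=3.0, wgi=0.0, epi=50.0, sdg=70.0)
print(example)                      # LPI and WGI both map to 50.0
print(equal_weight_index(example))  # 55.0
```

Because `rescale` is linear and increasing, sorting countries by the raw or the rescaled value gives the same order.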
Principal Component Analysis (PCA) was employed as the primary method for index construction, with suitability checked through standard diagnostics: the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and Bartlett’s test of sphericity. External coherence was examined through regression analysis, and regime classification was implemented using a two-stage clustering workflow combining hierarchical clustering (Ward’s method with squared Euclidean distance) and k-means clustering. All intermediate datasets, syntax-free SPSS procedures, and output files are included in the repository, allowing users to replicate each step of the workflow from data harmonization to final index construction and cluster assignment.
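For users working outside SPSS, the two-stage clustering workflow can be approximated as follows. This is a sketch under stated assumptions, not a reproduction of the SPSS output. SciPy's `linkage(..., method="ward")` stands in for SPSS's Ward's method, the centroids of the hierarchical solution are assumed to seed k-means, and both the number of clusters and the clustering variables are placeholders because the text does not state them. The function name `two_stage_clusters` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

def two_stage_clusters(X, k):
    """Ward hierarchical clustering followed by k-means refinement.

    Stage 1 cuts a Ward dendrogram into at most k groups.
    Stage 2 runs k-means starting from the stage-1 group centroids.
    Returns (labels, centroids); labels index rows of centroids.
    """
    X = np.asarray(X, dtype=float)
    tree = linkage(X, method="ward")                   # minimum-variance agglomeration
    hier = fcluster(tree, t=k, criterion="maxclust")   # hierarchical labels, starting at 1
    # ASSUMPTION: hierarchical centroids are used as k-means starting points.
    seeds = np.vstack([X[hier == c].mean(axis=0) for c in np.unique(hier)])
    centroids, labels = kmeans2(X, seeds, minit="matrix")
    return labels, centroids

# Synthetic stand-in: three well-separated groups of 20 "countries" on 4 indicators.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(m, 2.0, size=(20, 4)) for m in (20, 50, 80)])

# k = 3 is a placeholder, not the number of regimes reported by the authors.
labels, centroids = two_stage_clusters(X_demo, k=3)
```

Seeding k-means from the hierarchical solution makes the final assignment deterministic for a given dataset, which is the usual reason for combining the two methods.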
Institutions
- Geoponoko Panepistemio Athenon Schole Epharmosmenon Oikonomikon kai Koinonikon Epistemon, Attica, Athens