A Multi-Year Dataset of 226 Dune Segments
Description
# Dataset Title

Morphometric and Climatic Drivers of Dune Migration: A Multi-Year Dataset of 226 Dune Segments

# Description

This dataset contains morphological, climatic, and kinematic metrics for 226 individual dune segments monitored between 2017 and 2025 in a hyper-active aeolian system. The data was generated to analyze the drivers of dune migration using machine learning (XGBoost) and explainable AI techniques.

# Methodology

* **Dune Extraction:** Dune boundaries were segmented from high-resolution satellite imagery (Sentinel-2, Landsat 8/9) using a dual-stream deep learning model.
* **Migration Tracking:** Migration rates were computed using the COSI-Corr algorithm to measure centroid displacement with sub-pixel accuracy over annual intervals.
* **Topography:** Morphometric variables (width, slope) were derived from the ALOS World 3D-30m (AW3D30) DEM.
* **Climate:** Drift Potential (DP) was calculated from ERA5-Land hourly wind data using the Fryberger method.

# Variables

The dataset (`Seg_Processed.xlsx`) includes the following key variables:

1. `A_Dune_ID`: Unique identifier for each dune segment.
2. `Migration_rate` (m/year): The dependent variable representing the annual rate of dune celerity.
3. `A_width` (m): Dune width, serving as a proxy for dune size/volume (inverse-size scaling).
4. `Slope_Mean` (degrees): Mean stoss slope of the dune segment.
5. `DP_Mean` (vector units): Mean Drift Potential representing wind energy.

# Potential Use Cases

* Benchmarking machine learning models for geomorphic prediction.
* Validating physical scaling laws of sediment transport.
* Analyzing non-linear interactions between dune morphology and wind regime.
* Calibrating global sand transport models.

# Format

* File: `Seg_Processed.xlsx` (Excel format)
* Rows: 226
* Columns: 22
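The Drift Potential calculation named under Methodology can be illustrated with a minimal sketch. It is not the script used to build `DP_Mean`. It assumes the commonly cited Fryberger weighting, V²(V − Vt)/100 multiplied by the percentage of time the wind blows, with speeds in knots. The 12-knot threshold, the function name `drift_potential`, and the treatment of every hourly record as an equal share of time are assumptions made for illustration. The inputs stand for the 10 m eastward and northward wind components that ERA5-Land provides in m/s.

```python
import numpy as np

KNOTS_PER_MS = 1.943844  # 1 m/s expressed in knots


def drift_potential(u10, v10, threshold_knots=12.0):
    """Return (DP, RDP) in vector units from hourly 10 m wind components (m/s).

    threshold_knots=12 is an assumed impact threshold, not a dataset value.
    """
    u10 = np.asarray(u10, dtype=float)
    v10 = np.asarray(v10, dtype=float)
    speed_ms = np.hypot(u10, v10)
    speed_kn = speed_ms * KNOTS_PER_MS

    # Fryberger weighting factor V^2 (V - Vt) / 100, zero below the threshold.
    excess = np.clip(speed_kn - threshold_knots, 0.0, None)
    weight = speed_kn**2 * excess / 100.0

    # Each hourly record counts as an equal share of the record, in percent.
    t_percent = 100.0 / speed_kn.size
    dp_hourly = weight * t_percent

    # Split each hourly contribution along the direction the wind blows toward.
    safe_speed = np.where(speed_ms > 0, speed_ms, 1.0)  # avoid 0/0 for calm hours
    east = dp_hourly * u10 / safe_speed
    north = dp_hourly * v10 / safe_speed

    dp = float(dp_hourly.sum())                       # scalar sum of all contributions
    rdp = float(np.hypot(east.sum(), north.sum()))    # magnitude of the vector sum
    return dp, rdp


# Hypothetical hourly winds: two strong westerly hours and one calm hour.
dp, rdp = drift_potential([10.0, 10.0, 0.0], [0.0, 0.0, 0.0])
```

For a unidirectional wind record the resultant (RDP) equals DP, while opposing winds of equal strength cancel in RDP but not in DP, which is why the ratio RDP/DP is often used as a measure of directional variability.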
Files
Steps to reproduce
## Dataset Description

* **Source:** `Seg_Processed.xlsx`
* **Total Observations:** 226 dune segments
* **Temporal Span:** Multi-year migration measurements (2017-2025)
* **Spatial Unit:** Individual dune segments

### Variable Selection

**Target Variable:**

* `Migration_rate` (m/year): The rate at which dune segments migrate downwind

**Predictor Variables:**

* `A_width` (m): Dune width, serving as a proxy for dune size/volume
* `DP_Mean` (vector units): Mean Drift Potential, calculated from ERA5-Land wind data using the Fryberger formula
* `Slope_Mean` (degrees): Mean stoss slope derived from digital elevation models

**Excluded Variables:**

* `NDVI_Mean`: Excluded per study design to focus on geomorphic and aerodynamic controls

### Data Quality Control

* **Missing Value Treatment:** Rows with missing values in predictor or target variables were removed using listwise deletion
* **Data Type Conversion:** All numeric columns converted to float64 using `pd.to_numeric()` with error coercion
* **Final Sample Size:** 226 observations (no missing values detected)

## Reproducibility Statement

All analyses were performed in Python 3.12 using the following key packages:

* **Data Handling:** pandas 2.x, numpy 2.2
* **Machine Learning:** scikit-learn 1.x, xgboost 3.0
* **Explainability:** shap 0.x
* **Visualization:** matplotlib 3.x, seaborn 0.x

**Random Seeds:** Fixed at 42 for all stochastic processes (train-test split, k-means initialization)

**Code Availability:** All analysis scripts (01-16) are provided in the repository.
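The quality-control steps listed under Data Quality Control can be sketched as follows. This is an illustrative outline, not one of the repository scripts. The helper names `clean_segments` and `holdout_split` are hypothetical, the 20% test fraction is an assumption (the split ratio is not stated here), and `DataFrame.sample` is used as a simple stand-in for whatever train-test split routine the scripts actually call. The toy rows are made up and are not values from the dataset.

```python
import pandas as pd

TARGET = "Migration_rate"
PREDICTORS = ["A_width", "DP_Mean", "Slope_Mean"]


def clean_segments(df):
    """Coerce model columns to float64, then apply listwise deletion."""
    cols = PREDICTORS + [TARGET]
    out = df.copy()
    for col in cols:
        # Unparseable entries become NaN instead of raising an error.
        out[col] = pd.to_numeric(out[col], errors="coerce").astype("float64")
    # Drop any row that is missing a predictor or the target.
    return out.dropna(subset=cols).reset_index(drop=True)


def holdout_split(df, test_fraction=0.2, seed=42):
    """Seeded random holdout split; test_fraction is an assumed value."""
    test = df.sample(frac=test_fraction, random_state=seed)
    train = df.drop(index=test.index)
    return train, test


# Made-up rows for illustration only (not taken from Seg_Processed.xlsx).
toy = pd.DataFrame({
    "A_Dune_ID": [1, 2, 3, 4, 5],
    "Migration_rate": ["10.5", "n/a", 8.0, 7.5, 6.0],
    "A_width": [100, 120, None, 90, 80],
    "DP_Mean": [300, 310, 320, 330, 340],
    "Slope_Mean": [5, 6, 7, 8, 9],
})
clean = clean_segments(toy)
train, test = holdout_split(clean)
```

With the real file, the input frame would come from `pd.read_excel("Seg_Processed.xlsx")`, which needs an Excel engine such as openpyxl installed. Since the description reports no missing values, the cleaned frame should keep all 226 rows.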
Institutions
- Université Sultan Moulay Slimane, Faculté des Sciences et Techniques