Submarine Landslide Risk Concerning Military Conflicts in the Strait of Hormuz and Gulf of Oman
Description
Environmental Features: This folder contains the raster files of the six environmental variables used as input features for the machine learning models: Water Depth (`Depth.tif`), Slope (`Slope.tif`), Roughness (`Roughness.tif`), Curvature (`Curvature.tif`), Distance to Fault (`Fault.tif`), and Peak Ground Acceleration (`PGA.tif`). The `Depth.xyz` file contains the raw coordinate and depth data. All raster files are provided in GeoTIFF format with a spatial resolution of approximately 450 m (15 arc seconds), covering the study area in the Strait of Hormuz and the northern Gulf of Oman.

Model Weights: This folder contains the trained model weights for the four ensemble learning algorithms evaluated in this study: `RandomForest_model.pkl`, `XGBoost_model.pkl`, `LightGBM_model.pkl`, and `CatBoost_model.pkl`. The models were trained using the environmental features listed above and the global submarine landslide inventory. The weights are saved in Python pickle (.pkl) format and can be loaded using the corresponding libraries (scikit-learn, XGBoost, LightGBM, CatBoost).

Prediction Results: This folder contains the submarine landslide susceptibility prediction results generated by the optimal Random Forest model under nine scenarios: Background (no explosion), the Strait of Hormuz (10 kt), and the Gulf of Oman (1 t, 10 t, 100 t, 1 kt, 10 kt, 100 kt, and 1 Mt). Each raster file (`Prediction_*.tif`) contains the predicted landslide susceptibility probability (ranging from 0 to 1) at a spatial resolution of approximately 450 m in GeoTIFF format.

Earthquake Location: This folder contains 8 text files (`Gulf of Oman PGA_*.txt` and `Strait of Hormuz PGA_100kt.txt`) specifying the location (longitude, latitude), magnitude, and focal depth of simulated explosions. Each file contains a single line of data used as input for PGA calculation and subsequent landslide susceptibility modeling.
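As a rough illustration of how one of the single-line scenario files in the `Earthquake Location` folder might be read, the sketch below parses longitude, latitude, magnitude, and focal depth from a line of text. The field order follows the description above, but the delimiter (whitespace or commas), the absence of a header row, and the names `ExplosionScenario`, `parse_scenario_line`, and `read_scenario_file` are assumptions and are not taken from the dataset.

```python
import re
from typing import NamedTuple


class ExplosionScenario(NamedTuple):
    """Hypothetical container for one simulated explosion."""
    longitude: float
    latitude: float
    magnitude: float
    focal_depth: float


def parse_scenario_line(line: str) -> ExplosionScenario:
    """Parse 'lon lat magnitude depth' from a single line of text."""
    # Assumption: fields are separated by spaces, tabs, or commas.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return ExplosionScenario(*(float(f) for f in fields))


def read_scenario_file(path: str) -> ExplosionScenario:
    """Return the scenario stored on the first non-empty line of a file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return parse_scenario_line(line)
    raise ValueError(f"no data line found in {path}")


if __name__ == "__main__":
    # Made-up values for illustration only, not taken from the dataset.
    print(parse_scenario_line("58.0\t24.5\t4.75\t0.0"))
```

If the actual files use a different column order or include a header, the parsing step would need to be adjusted accordingly.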
Dataset: This folder contains the training dataset (`Training Dataset.txt`) and prediction datasets (`predicted dataset_*.txt`) used for model development and scenario simulations. All files are tab-delimited text files with consistent feature columns, serving as input for landslide susceptibility prediction.

Python Scripts: The following Python scripts are located in the root directory and were used for data processing, model training, and prediction:
- `Nuclear_explosion_seismology.py` – Calculates the equivalent earthquake magnitude for explosions at different locations and yields.
- `Machine_learning_classification_model_training_and_prediction.py` – Performs model training and generates landslide susceptibility predictions.
- `Generate_Prediction_Dataset.py` – Prepares the input datasets required for prediction.
- `Calculate_PGA.py` – Computes the Peak Ground Acceleration (PGA) across the entire study area following simulated explosions.
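The tab-delimited dataset files described above can be inspected with the Python standard library alone. The following is a minimal sketch, assuming the first row is a header and all remaining cells are numeric. Neither assumption is confirmed by this description, and the function name `read_tab_delimited` and the sample column names are hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple


def read_tab_delimited(lines: Iterable[str]) -> Tuple[List[str], List[List[float]]]:
    """Return (column names, numeric rows) from tab-delimited text.

    Assumption: the first row is a header and every other cell is numeric.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    # Skip blank lines; convert every remaining cell to float.
    rows = [[float(cell) for cell in row] for row in reader if row]
    return header, rows


if __name__ == "__main__":
    # Made-up sample with hypothetical column names, not real data.
    sample = io.StringIO(
        "Depth\tSlope\tPGA\n"
        "-1200.0\t3.5\t0.02\n"
        "-850.0\t7.1\t0.05\n"
    )
    header, rows = read_tab_delimited(sample)
    print(header, len(rows))
```

For a file on disk, the same function can be called as `read_tab_delimited(open(path, newline=""))`, where `path` points to one of the dataset files.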
Steps to reproduce
1. Environmental Features: The six environmental variables (water depth, slope, roughness, curvature, distance to fault, and PGA) were derived from the GEBCO_2023 bathymetric grid, GEM Global Active Faults Database, and STEAD earthquake catalog. All raster processing was performed using ArcMap Spatial Analyst tools. The raw coordinate and depth data are provided in `Depth.xyz`.
2. Earthquake Location: Explosion scenarios were defined by specifying longitude, latitude, magnitude, and focal depth in individual text files within the `Earthquake Location` folder. These parameters were used as input for PGA calculation.
3. PGA Calculation: The `Calculate_PGA.py` script computes the Peak Ground Acceleration across the entire study area for each explosion scenario. Output PGA raster files are saved in GeoTIFF format.
4. Prediction Dataset Generation: The `Generate_Prediction_Dataset.py` script extracts feature values from all environmental raster layers at each prediction point and compiles them into tab-delimited text files within the `Dataset` folder.
5. Model Training: Four ensemble machine learning algorithms (Random Forest, XGBoost, LightGBM, and CatBoost) were trained on `Training Dataset.txt` (10,003 samples) with a 4:1 train–test split. The training and evaluation workflow is implemented in `Machine_learning_classification_model_training_and_prediction.py`. Model weights are saved in the `Model Weights` folder in Python pickle (.pkl) format.
6. Equivalent Magnitude Conversion: For explosion scenarios, equivalent seismic magnitudes were calculated using the empirical relationship \(M = 4.0 + 0.75 \cdot \log_{10}(Y)\), where \(Y\) is the TNT-equivalent yield in kilotons. This conversion is implemented in `Nuclear_explosion_seismology.py`.
7. Prediction Results: The trained Random Forest model was applied to each prediction dataset to generate landslide susceptibility maps. Output raster files are saved in the `Prediction Results` folder in GeoTIFF format with a spatial resolution of approximately 450 m.
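The empirical relationship in step 6 can be evaluated directly. The sketch below applies \(M = 4.0 + 0.75 \cdot \log_{10}(Y)\) to the seven Gulf of Oman yields listed in the description, converted to kilotons. It is an illustration of the formula only, not the contents of `Nuclear_explosion_seismology.py`; the function name `equivalent_magnitude` and the `YIELDS_KT` table are hypothetical.

```python
import math


def equivalent_magnitude(yield_kt: float) -> float:
    """Equivalent magnitude M = 4.0 + 0.75 * log10(Y), with Y in kilotons of TNT."""
    if yield_kt <= 0:
        raise ValueError("yield must be positive")
    return 4.0 + 0.75 * math.log10(yield_kt)


# Gulf of Oman yields from the description, expressed in kilotons.
YIELDS_KT = {
    "1 t": 1e-3,
    "10 t": 1e-2,
    "100 t": 1e-1,
    "1 kt": 1.0,
    "10 kt": 10.0,
    "100 kt": 100.0,
    "1 Mt": 1000.0,
}

if __name__ == "__main__":
    for label, yield_kt in YIELDS_KT.items():
        print(f"{label:>6}: M = {equivalent_magnitude(yield_kt):.2f}")
```

Under this relationship a 1 kt explosion corresponds to \(M = 4.0\), and each tenfold increase in yield adds 0.75 magnitude units.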
Institutions
- Laoshan Laboratory, Shandong, Qingdao