Dataset for "Analysis of Flood Susceptibility Using Remote Sensing and Machine Learning Techniques in Gaibandha District, Bangladesh"
Description
Floods are among the most recurrent and devastating natural hazards in Bangladesh, particularly in low-lying districts such as Gaibandha. This study assesses flood hazard in Gaibandha by combining remote sensing data with machine learning algorithms. The models used topographic, hydrologic, and environmental predictors: digital elevation model (DEM), slope, aspect, curvature, Topographic Wetness Index (TWI), Topographic Roughness Index (TRI), drainage density, rainfall, Normalized Difference Vegetation Index (NDVI), and distance from rivers. Sentinel-1 SAR imagery was used to delineate the extent of the floods that occurred in 2018, 2020, 2022, and 2024 through backscatter histogram analysis. These four historical flood events were selected to cross-check model accuracy and the hazard mapping. A dataset of 1000 sample points (500 flood, 500 non-flood) was used to train and validate three models: Random Forest (RF), Decision Tree (DT), and K-Nearest Neighbors (KNN). The area under the ROC curve (AUC) was 0.95 for RF, 0.87 for DT, and 0.88 for KNN. Random Forest, the best-performing model, was used to generate the flood hazard maps. Regression analysis indicated that the RF flood susceptibility maps were more than 90% accurate. Field validation further supported the reliability of the model. The results revealed high-risk zones predominantly in Fulchari, Saghata, and Sundarganj Upazilas. The study underscores the utility of machine learning for flood risk assessment and highlights the importance of integrating spatial data with field-based validation to improve disaster resilience planning in flood-prone regions.
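The AUC values reported above can be read as the probability that a randomly chosen flood point receives a higher predicted score than a randomly chosen non-flood point. The following is a minimal sketch of that calculation using only the Python standard library. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the dataset.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: iterable of 1 (flood) or 0 (non-flood)
    scores: predicted flood scores, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one flood and one non-flood sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # flood point ranked above non-flood point
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (illustrative values only, not from the dataset)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

With a balanced sample such as the 500 flood and 500 non-flood points described here, the pairwise loop stays small enough to run quickly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.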