A Hybrid Physics-Xgboost Framework for Dam Breach Parameter Prediction with Missing Data
30 Pages Posted: 20 May 2025
Abstract
Study Region
Globally, dam breach-prone regions, often characterized by aging infrastructure, densely populated downstream areas, and increasing extreme weather events, pose significant challenges to accurate risk assessment and sustainable mitigation strategies.

Study focus
Accurate prediction of dam breach parameters is critical for disaster risk mitigation, yet conventional methods face limitations due to data scarcity and nonlinear hydromechanical interactions. This study proposes a hybrid framework integrating the physical mechanisms of the BREACH model with data-driven machine learning (ML) to address missing-data challenges. Leveraging Ward's clustering method, key parameters (e.g., dam height, reservoir storage, breach width) are weighted based on their hydro-mechanical coupling effects derived from the BREACH model, enabling the development of a physics-driven empirical formula for missing data imputation. The framework combines reconstructed parameters from 152 incomplete cases with 40 complete dam-break cases, forming a robust dataset of 192 samples.

New hydrological insights for the region
Validated against traditional approaches, the hybrid Physics-XGBoost model achieves superior performance: it improves peak discharge prediction accuracy by 22% (R² = 0.915 vs. 0.73 for mean imputation) and reduces errors by 16% compared to purely physics-based empirical formulas. The integration of BREACH-derived physical weights and clustering-driven relationships enhances interpretability while resolving the accuracy-efficiency trade-off in dam breach simulations. This advancement supports real-time emergency decision-making and infrastructure resilience enhancement, offering a paradigm shift in data-scarce hydraulic disaster modeling.
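
To make the clustering-then-imputation idea concrete, here is a minimal sketch. It is not the authors' method. The abstract describes a physics-driven empirical formula with BREACH-derived weights, which is not given here, so the sketch substitutes a plain within-cluster mean as the imputation rule. The function names (`ward_clusters`, `impute_by_cluster`), the toy feature values, and the choice of two clusters are all hypothetical. The clustering is a naive pure-Python version of Ward's criterion, which merges the pair of clusters whose union least increases the within-cluster sum of squares.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion, stopped at k clusters."""
    # Each cluster is a list of row indices into `points`.
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if clusters a and b merge.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        # combinations() yields pairs with i < j, so deleting index j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def impute_by_cluster(features, target, k):
    """Fill None entries of `target` with the mean of observed values in the same cluster.

    Assumption: a simple stand-in for the paper's physics-driven formula.
    """
    filled = list(target)
    for members in ward_clusters(features, k):
        observed = [target[i] for i in members if target[i] is not None]
        if not observed:
            continue  # no observed value in this cluster; leave the gaps
        mean = sum(observed) / len(observed)
        for i in members:
            if filled[i] is None:
                filled[i] = mean
    return filled


# Hypothetical, already-scaled features (e.g. dam height, reservoir storage)
# and a target (e.g. breach width) with two missing entries.
features = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.95, 0.85), (0.12, 0.22)]
target = [10.0, None, 50.0, None, 14.0]

filled = impute_by_cluster(features, target, k=2)
print(filled)
```

In a fuller pipeline the reconstructed rows would be combined with the complete cases and passed to a gradient-boosted regressor such as XGBoost. That step is omitted so the sketch runs on the standard library alone.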
Keywords: Dam breach parameters, hybrid framework, physics-driven, empirical formula, XGBoost, Ward's clustering method, missing data