Probabilistic Earthquake Forecasting in South Asia: Trends, Challenges, and Future Prospects
Abstract
Earthquake forecasting remains one of the most challenging problems in geoscience, particularly in seismically active regions such as South Asia. This paper presents a machine-learning framework for probabilistic earthquake forecasting that integrates traditional classifiers with seismic hazard analysis. We use historical earthquake data from the USGS catalog (1970–2024) and fault characteristics from the GEM Active Fault Database to develop predictive models, including logistic regression, support vector machines (SVM), decision trees, and random forests. Our methodology engineers features from seismic parameters (Gutenberg-Richter a- and b-values, slip rates, and monthly event counts) and addresses data sparsity and class imbalance through the synthetic minority oversampling technique (SMOTE). The random forest model achieved the highest performance (88% accuracy, 0.86 F1-score), a significant improvement over conventional Probabilistic Seismic Hazard Analysis (PSHA) methods. We discuss practical challenges of implementing early warning systems in South Asia's resource-constrained environments and propose a hybrid architecture that combines machine learning with physics-based models. This research contributes methodological advances in seismic forecasting and practical insights for disaster risk reduction in developing regions.
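The abstract names Gutenberg-Richter a and b values among the engineered features without stating how they are estimated. As an illustration only, the sketch below assumes the Aki-Utsu maximum-likelihood estimator, which may differ from the procedure used in this work. The function name `gutenberg_richter_ab`, the completeness magnitude `mc`, and the `bin_width` parameter are hypothetical choices made for this example.

```python
import math


def gutenberg_richter_ab(magnitudes, mc, bin_width=0.1):
    """Estimate Gutenberg-Richter a and b values from a list of magnitudes.

    Assumption: Aki-Utsu maximum-likelihood estimate of b, with the
    half-bin correction for magnitudes rounded to `bin_width`.
    Only events at or above the completeness magnitude `mc` are used.
    """
    complete = [m for m in magnitudes if m >= mc]
    if len(complete) < 2:
        raise ValueError("need at least two events at or above mc")

    mean_mag = sum(complete) / len(complete)
    denom = mean_mag - (mc - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected mc")

    # b = log10(e) / (mean(M) - (mc - bin_width / 2))
    b = math.log10(math.e) / denom
    # From log10 N(M >= mc) = a - b * mc, solve for a.
    a = math.log10(len(complete)) + b * mc
    return a, b


# Toy magnitudes for illustration only, not data from the USGS catalog.
sample = [3.0, 4.0, 4.5, 5.0, 5.5]
a_value, b_value = gutenberg_richter_ab(sample, mc=4.0, bin_width=0.0)
print(f"a = {a_value:.3f}, b = {b_value:.3f}")
```

In a feature-engineering pipeline of the kind summarized above, such a function would typically be applied per fault zone or per time window, so that each training row carries its own a and b estimates.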