Machine Learning-Enhanced Conformal Prediction Approach for Road Traffic Accident Severity Assessment: A Case Study of Rome
27 Pages Posted: 10 Jan 2024
Abstract
Road traffic accidents are a significant global health concern with far-reaching economic consequences. In an innovative bid to address this issue, our study predicts accident severity in Rome, leveraging a comprehensive dataset from 2006 to 2022, a first in the literature. We analyzed multiple factors, including weather conditions, road and vehicle conditions, types of accidents, and time-related aspects.

Distinguishing our study, we applied one-hot encoding to categorical variables, demonstrating superior model performance over traditional label encoding. Additionally, we employed the Synthetic Minority Over-sampling Technique (SMOTE) to handle data imbalance and provided a detailed analysis of its impacts on model performance.

Our key innovation lies in implementing conformal prediction to quantify prediction uncertainty. Given the prevalent skewness in traffic accident datasets, this technique enhances decision-making reliability and precision.

We deployed a series of machine learning models, with the Extreme Gradient Boost (XGBoost) algorithm outperforming others in predicting injury severity, boasting a remarkable 77% accuracy rate. Implementing SHapley Additive exPlanations (SHAP) ensured model interpretability, underscoring the type of vehicles involved, the nature of the accident, and road shape as the most influential factors.

In conclusion, our approach combining XGBoost, one-hot encoding, SMOTE, and conformal prediction provides a comprehensive, efficient, and transparent method for predicting road accident severity. Incorporating conformal prediction offers critical insights into model uncertainty, significantly aiding decision-making in road traffic safety. Additionally, SHAP analysis highlights the key factors contributing to accident severity, guiding focused preventive strategies.
Keywords: Road Accident Severity, Extreme Gradient Boosting, One-Hot Encoding, Synthetic Minority Over-Sampling Technique, Conformal Prediction, SHAP
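The abstract names conformal prediction as the means of quantifying prediction uncertainty but does not say which variant was used. As an illustration only, the following is a minimal sketch of split (inductive) conformal prediction for classification, which may differ from the authors' actual procedure. The function names, the three class indices standing in for severity levels, and all probability values are hypothetical; in practice the probabilities would come from a fitted classifier such as XGBoost.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal score threshold for a target miscoverage level alpha."""
    # Nonconformity score: 1 minus the probability given to the true class.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every class.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Classes whose nonconformity score does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration data: class probabilities and true labels
# (0, 1, 2 stand in for three made-up severity levels).
cal_probs = [
    [0.95, 0.03, 0.02], [0.05, 0.90, 0.05], [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10], [0.15, 0.10, 0.75], [0.70, 0.20, 0.10],
    [0.30, 0.60, 0.10], [0.50, 0.40, 0.10], [0.60, 0.30, 0.10],
]
cal_labels = [0, 1, 0, 1, 2, 0, 1, 1, 1]

# Target roughly 75% coverage (alpha = 0.25) on exchangeable data.
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

confident = prediction_set([0.90, 0.07, 0.03], threshold)
ambiguous = prediction_set([0.45, 0.42, 0.13], threshold)
print(threshold, confident, ambiguous)
```

With this toy data the confident case yields a single class, while the ambiguous case yields a set of two classes. The size of the set is the uncertainty signal: a larger set indicates that the model cannot separate the candidate severity levels at the chosen coverage level.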