On Boosting: Theory and Applications
39 Pages
Posted: 20 Jun 2019
Date Written: June 11, 2019
Abstract
We provide an overview of two commonly used boosting methodologies. We first describe several implementations and the statistical theory behind selected algorithms that are widely used in the machine learning community. We then discuss a case study on predicting car insurance claims over a fixed future time interval. The results of the case study show that XGBoost generally outperforms AdaBoost, and that it performs best with shallow trees, moderate shrinkage, more boosting iterations than the default, and subsampling of both features and training data points.
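As a rough illustration of the configuration the abstract describes, the following is a minimal Python sketch of an XGBoost classifier with shallow trees, moderate shrinkage, an increased number of boosting rounds, and subsampling of both rows and features. The specific hyperparameter values are illustrative assumptions, not the settings used in the paper.

```python
# Minimal sketch: an XGBoost classifier configured along the lines the abstract
# describes. All hyperparameter values are illustrative assumptions, not the
# paper's tuned settings.
from xgboost import XGBClassifier

model = XGBClassifier(
    max_depth=3,           # shallow trees
    learning_rate=0.05,    # moderate shrinkage
    n_estimators=500,      # more boosting iterations than the default of 100
    subsample=0.8,         # subsample training rows for each tree
    colsample_bytree=0.8,  # subsample features for each tree
)
# Fitting would then proceed on the training data, e.g. model.fit(X_train, y_train).
```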
Keywords: machine learning, boosting, predictive modeling, R, Python, car insurance, Kaggle, Porto Seguro, AdaBoost, XGBoost