Imbalanced Learning for Insurance Using Modified Loss Functions in Tree-Based Models

36 Pages Posted: 6 May 2022

Changyue Hu

University of Illinois at Urbana-Champaign - Department of Mathematics

Zhiyu Quan

The University of Illinois at Urbana-Champaign

Wing Fung Chong

Heriot-Watt University - Department of Actuarial Mathematics and Statistics

Date Written: April 1, 2022

Abstract

Tree-based models have gained momentum in insurance claim loss modeling; however, the point mass at zero and the heavy tail of insurance loss distributions make it challenging to apply conventional methods directly to claim loss modeling. Using a simple illustrative dataset, we first demonstrate how the traditional tree-based algorithm's splitting function fails to cope with a large proportion of zero responses. To address the imbalance issue that arises in such loss modeling, this paper modifies the traditional splitting function of the Classification and Regression Tree (CART). In particular, we propose two novel modified loss functions, namely the weighted sum of squared error and the sum of squared Canberra error. These modified loss functions impose a significant penalty on grouping observations with non-zero responses together with those with zero responses during the splitting procedure, and thus significantly enhance their separation. Finally, we examine and compare the predictive performance of the modified tree-based models against the traditional model on synthetic datasets that imitate insurance losses. The results show that the modification leads to substantially different tree structures and improved prediction performance.
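
As a rough illustration of the idea, the following Python sketch swaps the node loss used to score candidate splits on a single feature. The abstract does not state the formulas, so both modified losses here are assumptions rather than the authors' definitions: `weighted_sse` applies a hypothetical weight `w_nonzero` to non-zero responses, and `canberra_sse` sums the squared standard Canberra distance |y - c| / (|y| + |c|) between each response and the node mean c, with 0/0 taken as 0. All function names and the toy data are illustrative.

```python
import numpy as np

def sse(y):
    """Traditional CART node loss: squared errors around the node mean."""
    return float(np.sum((y - y.mean()) ** 2))

def weighted_sse(y, w_nonzero=10.0):
    """Assumed form: squared errors, with non-zero responses up-weighted.
    The weight value and the use of the plain node mean are assumptions."""
    w = np.where(y != 0, w_nonzero, 1.0)
    return float(np.sum(w * (y - y.mean()) ** 2))

def canberra_sse(y):
    """Assumed form: sum of squared Canberra distances to the node mean.
    A zero response in a node with a non-zero mean contributes exactly 1."""
    c = y.mean()
    num = np.abs(y - c)
    denom = np.abs(y) + abs(c)
    # Define 0/0 as 0 so an all-zero node has zero loss.
    d = np.divide(num, denom, out=np.zeros_like(num, dtype=float), where=denom > 0)
    return float(np.sum(d ** 2))

def best_split(x, y, node_loss):
    """Exhaustive threshold search on one feature.
    Returns (threshold, total loss of the two child nodes)."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_threshold, best_loss = None, np.inf
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no valid threshold between equal feature values
        total = node_loss(y[:i]) + node_loss(y[i:])
        if total < best_loss:
            best_threshold, best_loss = (x[i - 1] + x[i]) / 2, total
    return best_threshold, best_loss

# Toy data (made up for illustration): mostly zeros plus a heavy-tailed claim.
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_toy = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 100.0])

for name, loss in [("sse", sse), ("weighted_sse", weighted_sse), ("canberra_sse", canberra_sse)]:
    print(name, best_split(x_toy, y_toy, loss))
```

On this toy data, plain SSE isolates the single large claim (threshold 5.5) and leaves the small non-zero claim grouped with the zeros, whereas the assumed Canberra-based loss splits at 4.5 and separates zero from non-zero responses. This is only meant to show the mechanism the abstract describes, not to reproduce the paper's results.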

Keywords: Predictive model of insurance claims, imbalanced learning, custom loss, Canberra distance, regression tree, tree-based algorithms.

Suggested Citation

Hu, Changyue and Quan, Zhiyu and Chong, Wing Fung, Imbalanced Learning for Insurance Using Modified Loss Functions in Tree-Based Models (April 1, 2022). Available at SSRN: https://ssrn.com/abstract=4086867 or http://dx.doi.org/10.2139/ssrn.4086867

Changyue Hu

University of Illinois at Urbana-Champaign - Department of Mathematics

1409 W. Green Street
Urbana, IL 61801
United States

Zhiyu Quan (Contact Author)

The University of Illinois at Urbana-Champaign

Wing Fung Chong

Heriot-Watt University - Department of Actuarial Mathematics and Statistics

Edinburgh, Scotland EH14 4AS
United Kingdom
