Combining Multiple Probability Predictions in the Presence of Class Imbalance to Discriminate Between Potential Bad and Good Borrowers in the Peer-to-Peer Lending Market

Journal of Behavioral and Experimental Finance, Forthcoming

24 Pages. Posted: 10 Feb 2020

Date Written: January 16, 2020

Abstract

Credit risk scoring predictions represent an effective guide for lenders to discriminate between potential good (who will repay the loan) and bad (who will default) borrowers in the online social lending market. A common characteristic of such a market is a lower percentage of defaulted than non-defaulted borrowers; thus, the sample is class imbalanced. Class imbalance may affect the accuracy of default predictions, as classifiers tend to be biased towards the majority class (good borrowers). We analyse default prediction performance when class rebalancing methods are combined with different regression and machine learning techniques. We also propose combining multiple probability predictions to improve predictive performance. The analysis is based on a book of loans (with a three-year term) funded in the 2010-2015 period through the online platform of Lending Club. The results show that some measures of predictive accuracy tend to improve when the scoring models are trained on a rebalanced rather than an imbalanced sample, except when the extreme gradient boosting approach is applied. Finally, we find that combining multiple probability predictions via regularised logistic regression may help to improve predictive accuracy.
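
The two ideas in the abstract, rebalancing the training sample and combining several probability predictions through a regularised logistic regression, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's procedure: the loans are synthetic, random undersampling stands in for whichever rebalancing methods the paper evaluates, the two base models are plain logistic regressions on one hypothetical feature each, and the combiner uses an L2 (ridge) penalty because the abstract does not say which penalty is used.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def undersample(X, y, rng):
    """Random undersampling: keep every minority row and an equal-sized
    random subset of majority rows."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = minority + rng.sample(majority, len(minority))
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]

def fit_logistic(X, y, l2=0.0, lr=0.5, epochs=300):
    """Batch gradient descent for logistic regression with an L2 penalty
    on the weights (the intercept is not penalised)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def make_loans(n, rng):
    # Synthetic, imbalanced data: y = 1 marks a defaulted (bad) borrower.
    X, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p_default = sigmoid(-2.5 + 1.2 * x1 + 0.8 * x2)
        X.append([x1, x2])
        y.append(1 if rng.random() < p_default else 0)
    return X, y

def column(X, j):
    return [[row[j]] for row in X]

rng = random.Random(0)
X_train, y_train = make_loans(600, rng)   # used to fit the base models
X_stack, y_stack = make_loans(300, rng)   # held out to fit the combiner

# Step 1: rebalance the training sample only.
X_bal, y_bal = undersample(X_train, y_train, rng)

# Step 2: fit two base models, each on a different (hypothetical) feature.
base_models = [fit_logistic(column(X_bal, j), y_bal) for j in range(2)]

# Step 3: base-model default probabilities become the combiner's inputs.
base_probs = [predict_proba(m, column(X_stack, j)) for j, m in enumerate(base_models)]
Z_stack = [list(row) for row in zip(*base_probs)]

# Step 4: combine the predictions with an L2-regularised logistic regression.
combiner = fit_logistic(Z_stack, y_stack, l2=0.01)
combined = predict_proba(combiner, Z_stack)

print("default rate before rebalancing:", sum(y_train) / len(y_train))
print("default rate after rebalancing: ", sum(y_bal) / len(y_bal))
print("combiner weights and intercept: ", combiner)
```

In this sketch the combiner is fitted on a held-out, still imbalanced sample so that its inputs are out-of-sample predictions; whether the paper fits its combiner this way is not stated in the abstract. A real application would also score the combined probabilities on a separate test set.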

Keywords: Class imbalance; Machine learning; Combining multiple probability predictions; Credit risk scoring prediction; Peer-to-peer lending

JEL Classification: D82; G4

Suggested Citation

Zanin, Luca, Combining Multiple Probability Predictions in the Presence of Class Imbalance to Discriminate Between Potential Bad and Good Borrowers in the Peer-to-Peer Lending Market (January 16, 2020). Journal of Behavioral and Experimental Finance, Forthcoming. Available at SSRN: https://ssrn.com/abstract=3520605

