Will They Repay Their Debt? Identification of Borrowers Likely to Be Charged Off

Management & Marketing, Forthcoming

14 Pages; Posted: 29 Aug 2020

Raluca Caplescu

The Bucharest University of Economic Studies

Ana-Maria Panaite

The Bucharest University of Economic Studies

Daniel Traian Pele

Bucharest University of Economic Studies; Romanian Academy - Institute for Economic Forecasting

Vasile Alecsandru Strat

The Bucharest University of Economic Studies

Date Written: July 22, 2020

Abstract

The recent increase in P2P lending has prompted the development of models that separate good clients from bad ones in order to mitigate risks for both lenders and platforms. The rapidly growing body of literature provides several comparisons between various models. The most frequently employed are logistic regression, SVM, neural networks and decision tree-based models. Among them, logistic regression has proved to be a strong candidate because of both its good performance and its high explainability. The present paper compares four pairs of models (each fitted on imbalanced and on under-sampled data) meant to predict charged-off clients by optimizing the F1 score. We found that, if the data is balanced, Logistic Regression, both simple and with Stochastic Gradient Descent, outperforms LightGBM and K-Nearest Neighbors in optimizing the F1 score. We chose this metric because it balances the interests of the lenders and those of the platform. Loan term, DTI and number of accounts were found to be important predictors that are positively related to the risk of charge-off. At the other end of the spectrum, by far the strongest impact on charge-off probability is that of the FICO score. The final number of features retained by the two Logistic Regression models differs considerably because, although both use Lasso for feature selection, Stochastic Gradient Descent Logistic Regression uses a stronger regularization. The analysis was performed using Python (numpy, pandas, sklearn and imblearn).

Keywords: peer-to-peer lending, creditworthiness, Logistic Regression, KNN, LightGBM

Suggested Citation

Caplescu, Raluca and Panaite, Ana-Maria and Pele, Daniel Traian and Strat, Vasile Alecsandru, Will They Repay Their Debt? Identification of Borrowers Likely to Be Charged Off (July 22, 2020). Management & Marketing, Forthcoming, Available at SSRN: https://ssrn.com/abstract=3658606

Raluca Caplescu (Contact Author)

The Bucharest University of Economic Studies

Romania

Ana-Maria Panaite

The Bucharest University of Economic Studies

Romania

Daniel Traian Pele

Bucharest University of Economic Studies

Piata Romana nr. 6
Bucharest
Romania

Romanian Academy - Institute for Economic Forecasting

Calea 13 Septembrie nr. 13
Bucharest, 050711
Romania

Vasile Alecsandru Strat

The Bucharest University of Economic Studies

Romania
