Improvements in PD Models: A Case-Study Approach

Proceedings of the 15th International Conference on Business Excellence 2021

20 Pages Posted: 20 Apr 2021

Raluca Caplescu

The Bucharest University of Economic Studies

Simona Cojocea

University of Bucharest

Daniel Traian Pele

Bucharest University of Economic Studies; Romanian Academy - Institute for Economic Forecasting

Vasile Alecsandru Strat

The Bucharest University of Economic Studies

Date Written: March 19, 2021

Abstract

Models for estimating the probability of default (PD) are widely used in business throughout the lending process, starting as early as the application stage, where they play an important role in the loan approval decision. Adequate data quality is essential for model soundness and performance. Identifying outliers, analyzing their impact, and choosing the right method to treat them is a necessary preprocessing stage, yet it is often overlooked in practice for a variety of reasons, an important one being insufficient data. Given the inherent imbalance of the loan portfolio with regard to default status, eliminating outliers is seldom feasible. The currently accepted approach is based on binning and weight of evidence. Usually two types of binning are tested, namely bucket and quantile. While the latter is robust to the presence of outliers, the former is not. Both approaches discretize the continuous variable to which they are applied. This causes information loss, both in the variation carried by individual values and in the distance between observation points on a given variable. In the present paper, we explore other methods for dealing with outliers and describe their advantages and disadvantages in the context of probability of default estimation for credit risk. We conclude that, aside from quantile binning, winsorizing or, for very large datasets, leaving outliers untreated is also effective. More importantly, several methods should be considered and tested for each variable in order to find the optimal balance between altering the data and reducing variance.
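
To make the techniques named in the abstract concrete, the following is a minimal sketch in plain Python of bucket (equal-width) binning, quantile binning, weight of evidence, and winsorizing. It is not the authors' implementation: the function names, the smoothing constant, the sign convention for weight of evidence (log of the non-default share over the default share), and the ten-row toy sample are all assumptions made for illustration.

```python
import math
from collections import Counter


def quantile(sorted_vals, q):
    """Empirical quantile of an already sorted list (linear interpolation)."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clip values to the chosen quantiles instead of dropping outliers."""
    s = sorted(values)
    lo, hi = quantile(s, lower_q), quantile(s, upper_q)
    return [min(max(v, lo), hi) for v in values]


def bucket_bins(values, n_bins):
    """Equal-width binning: edges depend on min and max, so outliers matter."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant variable
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def quantile_bins(values, n_bins):
    """Equal-frequency binning by rank: only the ordering of values matters.
    Simplification: tied values may land in different bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def weight_of_evidence(bins, defaults, smooth=0.5):
    """WoE per bin = ln(share of non-defaults / share of defaults).
    The additive smoothing (an assumption) avoids division by zero."""
    good = Counter(b for b, d in zip(bins, defaults) if d == 0)
    bad = Counter(b for b, d in zip(bins, defaults) if d == 1)
    labels = sorted(set(bins))
    good_total = sum(good.values()) + smooth * len(labels)
    bad_total = sum(bad.values()) + smooth * len(labels)
    return {
        b: math.log(((good[b] + smooth) / good_total)
                    / ((bad[b] + smooth) / bad_total))
        for b in labels
    }


# Hypothetical toy sample: one extreme value, three defaults out of ten.
income = [20, 22, 25, 27, 30, 32, 35, 38, 40, 1000]
default = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

by_bucket = bucket_bins(income, 3)
by_quantile = quantile_bins(income, 3)
woe = weight_of_evidence(by_quantile, default)
clipped = winsorize(income, 0.10, 0.90)

print(Counter(by_bucket))    # bin sizes under equal-width binning
print(Counter(by_quantile))  # bin sizes under quantile binning
print(woe)
print(max(clipped))
```

On this toy sample the single extreme value stretches the equal-width edges so that nine of the ten observations share one bucket, while the quantile bins stay roughly balanced. Winsorizing keeps the variable continuous and only pulls in the tails, which is the trade-off between altering the data and reducing variance that the abstract refers to.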

Keywords: PD Models, Outlier Treatment, Impact Analysis, P2P Lending, Lending Club

Suggested Citation

Caplescu, Raluca and Cojocea, Manuela-Simona and Pele, Daniel Traian and Strat, Vasile Alecsandru, Improvements in PD Models: A Case-Study Approach (March 19, 2021). Proceedings of the 15th International Conference on Business Excellence 2021, Available at SSRN: https://ssrn.com/abstract=3821829

Raluca Caplescu (Contact Author)

The Bucharest University of Economic Studies

Romania

Manuela-Simona Cojocea

University of Bucharest

14 Academiei St.
Bucharest, Bucuresti 70109
Romania

Daniel Traian Pele

Bucharest University of Economic Studies

Piata Romana nr. 6
Bucharest
Romania

Romanian Academy - Institute for Economic Forecasting

Calea 13 Septembrie nr. 13
Bucharest, 050711
Romania

Vasile Alecsandru Strat

The Bucharest University of Economic Studies

Romania
