53 Pages. Posted: 7 Dec 2009
Date Written: December 2009
This paper develops a specification of a credit scoring model with high discriminatory power to analyze data on loans in the retail banking market. Parametric and non-parametric approaches are employed to produce three models using logistic regression (parametric) and one model using Classification and Regression Trees (CART, non-parametric). The models are compared in terms of efficiency and power to discriminate between low-risk and high-risk clients, using data from a new European Union economy. We detect the most important characteristics of default behavior: the amount of resources the client has, the level of education, marital status, the purpose of the loan, and the number of years the client has had an account with the bank. Both methods are robust: they identify similar variables as determinants. We therefore show that parametric as well as non-parametric methods can produce successful models. We obtain similar results even when excluding a key financial variable (amount of own resources). The policy conclusion is that socio-demographic variables are important in the process of granting credit and therefore should not be excluded from credit scoring model specification.
Keywords: credit scoring, discrimination analysis, banking sector, pattern recognition, retail loans, CART, European Union
JEL Classification: B41, C14, D81, G21, P43
Suggested Citation:
Kocenda, Evzen and Vojtek, Martin, Default Predictors and Credit Scoring Models for Retail Banking (December 2009). CESifo Working Paper Series No. 2862. Available at SSRN: https://ssrn.com/abstract=1519792