Spelling Errors and Non-Standard Language in Peer-to-Peer Loan Applications and the Borrower’s Probability of Default
Lee, Michelle Seng Ah and Jatinder Singh. "Spelling Errors and Non-Standard Language in Peer-to-Peer Loan Applications and the Borrower’s Probability of Default." In Proceedings of Credit Scoring and Credit Control Conference XVII. 2021.
9 Pages. Posted: 18 Jun 2020. Last revised: 24 May 2021.
Date Written: May 25, 2020
As peer-to-peer (P2P) lenders evaluate the potential risk of each loan application, they may rely on subjective judgement when given qualitative information. Academics have found loan approval rates to be associated with the borrower's personality traits, social capital, and appearance. However, the association between a borrower's language and probability of default has yet to be considered. In this paper, we show that there are statistically significant linguistic differences in the free-text Lending Club loan descriptions between loans that default and loans that are fully paid. By engineering new features for non-standard language and spelling errors using natural language processing techniques, and by running multivariate logistic regression analyses, we find that the use of slang words, shorthand abbreviations, and spelling errors are each associated with a higher likelihood of default when controlling for the borrower's income and loan amount. However, neither the type of error (orthographic or phonological) nor its egregiousness affects the probability of default. Finally, we discuss the ethical implications of potential discriminatory bias given the association between poor spelling and disability status (e.g. dyslexia), national origin (i.e. English language familiarity), and personality traits (carelessness), laying the foundation for future work on bias in P2P lending and in other scenarios involving applicant-oriented risk assessments.
Keywords: peer-to-peer lending, discriminatory lending, alternative credit, credit risk, algorithmic bias, spelling error, non-standard language
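The feature engineering described in the abstract can be illustrated with a minimal sketch. It counts slang words, shorthand abbreviations, and likely spelling errors in a free-text loan description. The word lists and the function name `language_features` are hypothetical toy stand-ins chosen for illustration; the abstract does not specify the lexicons or the spell-checking method the authors used.

```python
import re

# Hypothetical toy lexicons for illustration only; the abstract does not
# say which dictionaries or slang lists the authors relied on.
KNOWN_WORDS = {
    "i", "need", "a", "loan", "to", "pay", "off", "my",
    "credit", "card", "debt", "and", "consolidate", "the", "for",
}
SLANG = {"gonna", "wanna", "kinda"}
ABBREVIATIONS = {"u", "pls", "thx", "asap"}


def language_features(description):
    """Count non-standard language markers in one loan description."""
    # Lowercase the text and keep alphabetic tokens (apostrophes allowed).
    tokens = re.findall(r"[a-z']+", description.lower())
    n_slang = sum(t in SLANG for t in tokens)
    n_abbrev = sum(t in ABBREVIATIONS for t in tokens)
    # Assumption: any token absent from every list is a spelling error.
    n_misspelled = sum(
        t not in KNOWN_WORDS and t not in SLANG and t not in ABBREVIATIONS
        for t in tokens
    )
    return {
        "n_tokens": len(tokens),
        "n_slang": n_slang,
        "n_abbrev": n_abbrev,
        "n_misspelled": n_misspelled,
    }


clean = language_features("I need a loan to pay off my credit card debt")
noisy = language_features("Pls i wanna consolodate my det asap thx")
print(clean)
print(noisy)
```

In a setup like the one the abstract describes, counts of this kind would enter a logistic regression on default status alongside the borrower's income and loan amount as controls.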