Credit Default Prediction from User-Generated Text in Peer-to-Peer Lending Using Deep Learning

31 Pages. Posted: 15 Feb 2020. Last revised: 30 Aug 2021.

Johannes Kriebel

University of Muenster

Lennart Stitz

University of Muenster

Date Written: January 13, 2020


Digital technologies produce vast amounts of unstructured data that traditional banks and fintech companies can store and access. We employ deep learning and several other techniques to extract credit-relevant information from user-generated text on Lending Club. Our results show that even short pieces of user-generated text can significantly improve credit default predictions. An information fusion analysis further supports the importance of text. Deep learning outperforms other text-based approaches in almost all cases. However, machine learning models combined with word frequencies or topic models also extract substantial credit-relevant information. A comparison of six deep neural network architectures, including state-of-the-art transformer models, finds that the architectures mostly perform similarly. In this credit scoring setting, simpler methods (such as average embedding neural networks) therefore offer performance comparable to more complex methods (such as the transformer networks BERT and RoBERTa).
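
To make the idea of an average embedding model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy vocabulary with made-up 3-dimensional word vectors (`EMBEDDINGS`) and hypothetical output-layer parameters (`WEIGHTS`, `BIAS`); in a real setting the vectors would be pretrained or learned, and the weights would be fitted on labelled loans. The text is represented by the mean of its word vectors, which a logistic output layer maps to a default probability.

```python
import math

# Toy 3-dimensional word vectors. These values are hypothetical and only
# serve to illustrate the mechanics; real embeddings are pretrained or learned.
EMBEDDINGS = {
    "pay": [0.2, -0.1, 0.4],
    "debt": [-0.3, 0.5, 0.1],
    "consolidate": [0.1, 0.3, -0.2],
    "stable": [0.4, -0.2, 0.0],
    "job": [0.3, -0.3, 0.1],
}
DIM = 3


def average_embedding(text):
    """Return the mean vector of all known tokens (zero vector if none)."""
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def default_probability(text, weights, bias):
    """Apply a logistic output layer to the averaged embedding."""
    features = average_embedding(text)
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters; in practice they are estimated from training data.
WEIGHTS = [-1.0, 2.0, 0.5]
BIAS = -1.5

description = "I have a stable job and want to consolidate debt"
p = default_probability(description, WEIGHTS, BIAS)
print(f"Predicted default probability: {p:.3f}")
```

Because the representation is a plain average, word order is ignored and texts of any length map to a vector of fixed size, which is what makes this architecture much simpler than a transformer.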

Keywords: Peer-to-peer lending, deep learning, textual data, credit risk

JEL Classification: G21, C14, C45

Suggested Citation

Kriebel, Johannes and Stitz, Lennart, Credit Default Prediction from User-Generated Text in Peer-to-Peer Lending Using Deep Learning (January 13, 2020). Available at SSRN: or

Johannes Kriebel (Contact Author)

University of Muenster

Universitätsstraße 14-16
Münster, D-48143

Lennart Stitz

University of Muenster

Schlossplatz 2
Muenster, D-48149
