Developing a Data Science Approach to Detecting Income Fraud for the Peer to Peer Loan Industry

13 Pages Posted: 9 Sep 2016

David Keough

Lipscomb University, Students

Nicolaus Enko

Lipscomb University, Students

Brian Shake

Lipscomb University, Students

Date Written: September 7, 2016

Abstract

Borrowers can obtain personal loans from very different types of lending institutions. The most common are traditional loan institutions, payday lenders, and Peer to Peer (P2P) lending brokers. P2P lending companies do not loan the money directly. They link the borrower to a lender and provide the lender with the borrower’s reported income, which is usually not verified. The P2P lender collects fees on each transaction and benefits financially from a higher number of introductions between borrowers and lenders. P2P lending is becoming more popular among borrowers because its highest interest rates are much lower than those of payday lenders and its loans require less verification of income and assets than those of traditional loan institutions. A higher reported income with a P2P lender can result in a larger loan for the borrower and thus more profit and fees for the P2P lender. If the loan defaults because the borrower overstated or fraudulently reported income, the P2P lender does not suffer; the lender that the P2P lender matched to the borrower incurs the financial loss. This paper proposes a data science approach to detecting loan applicants who provide fraudulent income data to P2P lenders. The data obtained for this study contained 887,379 observations and 74 variables on loan applicants from the P2P loan company Lending Club. Initial observations of this data showed that unverified loans made up 23% of the defaulted loans, while verified and source verified loans made up about 77%. This paper describes how the data set was obtained and prepared for analysis, the initial findings, the proposed data science approach to fully analyzing this data, and the significance to the lending industry, both traditional and P2P, of a method for detecting fraudulent income reported on loan applications.
Lenders could incorporate models generated from this analysis into their applications, and together with further research in this area these models should improve the P2P lending industry by increasing the detection of fraudulent income reported on loan applications.
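
As an illustration of the kind of initial observation reported above (the share of defaulted loans by income verification status), the following is a minimal Python sketch and not the authors' code. The field names `loan_status` and `verification_status`, the status labels, and the choice of which statuses count as defaulted are all assumptions made for the example, and the records are a tiny made-up sample rather than Lending Club data.

```python
from collections import Counter


def default_share_by_verification(loans, default_statuses=("Default", "Charged Off")):
    """Return each verification status's share of the defaulted loans.

    `loans` is an iterable of dicts with the assumed keys
    "loan_status" and "verification_status".
    """
    # Keep only loans whose status is treated as a default (an assumption).
    defaulted = [row for row in loans if row["loan_status"] in default_statuses]
    if not defaulted:
        return {}
    # Count defaulted loans per verification status and convert to shares.
    counts = Counter(row["verification_status"] for row in defaulted)
    total = len(defaulted)
    return {status: n / total for status, n in counts.items()}


# Tiny made-up records, for illustration only (not Lending Club data).
toy_loans = [
    {"loan_status": "Charged Off", "verification_status": "Not Verified"},
    {"loan_status": "Charged Off", "verification_status": "Verified"},
    {"loan_status": "Default", "verification_status": "Source Verified"},
    {"loan_status": "Default", "verification_status": "Verified"},
    {"loan_status": "Fully Paid", "verification_status": "Not Verified"},
]

shares = default_share_by_verification(toy_loans)
for status, share in sorted(shares.items()):
    print(f"{status}: {share:.0%}")
```

Applied to the full data set, a breakdown like this would only describe how defaults are distributed across verification statuses; it would not by itself identify fraudulent income reports.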

Keywords: Detect, Financial, Fraud Detection, Housing Crisis, Kaggle.com, Lending, Linearted

JEL Classification: G2, G20, G21, G23, G24, G28

Suggested Citation

Keough, David and Enko, Nicolaus and Shake, Brian, Developing a Data Science Approach to Detecting Income Fraud for the Peer to Peer Loan Industry (September 7, 2016). Available at SSRN: https://ssrn.com/abstract=2836134 or http://dx.doi.org/10.2139/ssrn.2836134

David Keough

Lipscomb University, Students

1 University Park Dr
Nashville, TN 37204
United States

Nicolaus Enko (Contact Author)

Lipscomb University, Students

1 University Park Dr
Nashville, TN 37204
United States

Brian Shake

Lipscomb University, Students

1 University Park Dr
Nashville, TN 37204
United States
