Machine Learning and Sampling Scheme: An Empirical Study of Money Laundering Detection

Computational Economics, Forthcoming

39 Pages. Posted: 26 Apr 2018. Last revised: 30 Oct 2018.

Yan Zhang

Government of the United States of America - Office of the Comptroller of the Currency (OCC)

Peter Trubey

University of California, Santa Cruz

Date Written: September 13, 2018

Abstract

This paper studies the interplay of machine learning and sampling scheme in an empirical analysis of money laundering detection algorithms. Using actual transaction data provided by a U.S. financial institution, we study five major machine learning algorithms: Bayes logistic regression, decision tree, random forest, support vector machine, and artificial neural network. Because money laundering events are rare, we apply and compare two sampling techniques that increase the relative presence of the events. Our analysis reveals potential advantages of machine learning algorithms in modeling money laundering events. This paper provides insights into the use of machine learning and sampling schemes in money laundering detection specifically, and in the classification of rare events in general.

Keywords: Bootstrap, Machine Learning, Money Laundering, Rare Event, Sampling Scheme

JEL Classification: G21, G28

Suggested Citation

Zhang, Yan and Trubey, Peter, Machine Learning and Sampling Scheme: An Empirical Study of Money Laundering Detection (September 13, 2018). Computational Economics, Forthcoming. Available at SSRN: https://ssrn.com/abstract=3161436 or http://dx.doi.org/10.2139/ssrn.3161436

Yan Zhang (Contact Author)

Government of the United States of America - Office of the Comptroller of the Currency (OCC)

400 7th St. SW
Washington, DC 20219
United States
202-6495492 (Phone)

Peter Trubey

University of California, Santa Cruz

1156 High St
Santa Cruz, CA 95064
United States
