Quality Checks on Granular Banking Data: An Experimental Approach Based on Machine Learning

23 Pages. Posted: 28 May 2020

Fabio Zambuto

Bank of Italy

Maria Rosaria Buzzi

Bank of Italy

Giuseppe Costanzo

Bank of Italy

Marco Di Lucido

Bank of Italy

Barbara La Ganga

Bank of Italy

Pasquale Maddaloni

Bank of Italy

Fabio Papale

Bank of Italy

Emiliano Svezia

Bank of Italy

Date Written: May 28, 2020

Abstract

We propose a new methodology, based on machine learning algorithms, for the automatic detection of outliers in the data that banks report to the Bank of Italy. Our analysis focuses on granular data gathered within the statistical data collection on payment services, in which the lack of strong ex ante deterministic relationships among the collected variables makes standard diagnostic approaches less powerful. Quantile regression forests are used to derive a region of acceptance for the targeted information. For a given level of probability, plausibility thresholds are obtained on the basis of individual bank characteristics and are automatically updated as new data are reported. The approach was applied to validate semi-annual data on debit card issuance received from reporting agents between December 2016 and June 2018. The algorithm was trained with data reported in previous periods and tested by cross-checking the identified outliers with the reporting agents. The method made it possible to detect, with a high level of precision in terms of false positives, new outliers that had not been detected using the standard procedures.
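
The idea of a region of acceptance bounded by conditional quantiles can be illustrated with a small sketch. This is not the authors' implementation and it is not a quantile regression forest: a fixed k-nearest-neighbour neighbourhood stands in for the forest-derived weights, and a single hypothetical feature (bank size) stands in for the individual bank characteristics. The function names, the parameters `k` and `level`, and the sample figures are all assumptions made for illustration.

```python
def empirical_quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def acceptance_region(history, size, k=5, level=0.90):
    """Return (lower, upper) plausibility thresholds for a bank of a given size.

    history: list of (size, reported_value) pairs from earlier periods.
    The k banks closest in size act as the peer group (a stand-in for the
    adaptive neighbourhoods that a quantile regression forest would learn).
    """
    neighbours = sorted(history, key=lambda row: abs(row[0] - size))[:k]
    values = [value for _, value in neighbours]
    tail = (1.0 - level) / 2.0
    return empirical_quantile(values, tail), empirical_quantile(values, 1.0 - tail)


def is_outlier(history, size, reported, k=5, level=0.90):
    """Flag a reported value that falls outside the acceptance region."""
    lower, upper = acceptance_region(history, size, k, level)
    return reported < lower or reported > upper


# Hypothetical past reports: (bank size, number of debit cards issued).
history = [(10, 100), (11, 110), (12, 118), (13, 131), (14, 139),
           (50, 500), (52, 515), (55, 560), (58, 575), (60, 610)]

print(acceptance_region(history, size=12))         # thresholds from small-bank peers
print(is_outlier(history, size=12, reported=125))  # inside the region
print(is_outlier(history, size=12, reported=600))  # far above the peers' values
```

Appending each newly validated report to `history` would update the thresholds as new data arrive, which mirrors, in a much simplified way, the automatic updating described in the abstract.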

Keywords: banking data, data quality management, outlier detection, machine learning, quantile regression, random forests

JEL Classification: C18, C81, G21

Suggested Citation

Zambuto, Fabio and Buzzi, Maria Rosaria and Costanzo, Giuseppe and Di Lucido, Marco and La Ganga, Barbara and Maddaloni, Pasquale and Papale, Fabio and Svezia, Emiliano, Quality Checks on Granular Banking Data: An Experimental Approach Based on Machine Learning (May 28, 2020). Bank of Italy Occasional Paper No. 547, Available at SSRN: https://ssrn.com/abstract=3612688 or http://dx.doi.org/10.2139/ssrn.3612688

Fabio Zambuto (Contact Author)

Bank of Italy

Via Nazionale 91
Rome, 00184
Italy
