A Methodology for Automatised Outlier Detection in High-Dimensional Datasets: An Application to Euro Area Banks’ Supervisory Data

57 Pages Posted: 1 Aug 2018

Matteo Farnè

Dipartimento Scienze Statistiche

Angelos T. Vouldis

Bank of Greece; European Central Bank (ECB)

Date Written: July 27, 2018

Abstract

Outlier detection in high-dimensional datasets poses new challenges that have not been investigated in the literature. In this paper, we present an integrated methodology for the identification of outliers that is suitable for datasets with a higher number of variables than observations. Our method aims to utilise all the relevant information present in a dataset to detect outliers in an automatised way, a feature that renders the method suitable for application to high-dimensional datasets. Our proposed five-step procedure for regression outlier detection entails a robust selection stage of the most explanatory variables, the estimation of a robust regression model based on the selected variables, and a criterion to identify outliers based on robust measures of the residuals' dispersion. The proposed procedure also deals with data redundancy and missing observations, which may inhibit the statistical processing of the data due to the ill-conditioning of the covariance matrix. The method is validated in a simulation study and in an application to actual supervisory data on banks’ total assets.
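
As a rough illustration of the last two ingredients named in the abstract (a robust regression fit followed by an outlier criterion based on a robust measure of residual dispersion), the following minimal sketch may help. It is not the paper's five-step procedure: it assumes a single regressor, swaps in the Theil-Sen estimator as the robust fit, and uses the scaled median absolute deviation (MAD) with a cutoff of 3 as the flagging rule. The function names `theil_sen` and `flag_outliers`, the cutoff, and the toy data are all assumptions made for this example only.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (intercept, slope) of a Theil-Sen fit for one regressor.

    The slope is the median of all pairwise slopes; the intercept is the
    median of y - slope * x. Both are resistant to a few gross outliers.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


def flag_outliers(x, y, threshold=3.0):
    """Return indices whose residual lies far from the residual median.

    Dispersion is measured by the MAD, scaled by 1.4826 so that it is
    comparable to a standard deviation under normality. The threshold of
    3 is an assumed, illustrative cutoff.
    """
    intercept, slope = theil_sen(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    center = median(residuals)
    scale = 1.4826 * median(abs(r - center) for r in residuals)
    if scale == 0:
        # Degenerate case: at least half of the residuals are identical.
        return [i for i, r in enumerate(residuals) if r != center]
    return [
        i for i, r in enumerate(residuals)
        if abs(r - center) > threshold * scale
    ]


# Toy data (hypothetical): y is roughly 2 + 0.5 * x, with one gross
# error planted at index 6.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2.1, 2.4, 3.05, 3.45, 4.0, 4.6, 15.0, 5.4, 6.05, 6.45]
flagged = flag_outliers(x, y)
print(flagged)
```

Because both the fit and the dispersion measure are median-based, the planted error barely moves the fitted line or the residual scale, which is what allows it to stand out. A setting with more variables than observations, as targeted by the paper, would additionally need the variable selection and missing-data handling that this sketch omits.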

Keywords: outlier detection, robust regression, variable selection, high dimension, missing data, banking data

JEL Classification: C18, C81, G21

Suggested Citation

Farnè, Matteo and Vouldis, Angelos T., A Methodology for Automatised Outlier Detection in High-Dimensional Datasets: An Application to Euro Area Banks’ Supervisory Data (July 27, 2018). ECB Working Paper No. 2171, Available at SSRN: https://ssrn.com/abstract=3224300 or http://dx.doi.org/10.2139/ssrn.3224300

Matteo Farnè

Dipartimento Scienze Statistiche

Via Zamboni, 33
Bologna, 40126
Italy

Angelos T. Vouldis (Contact Author)

European Central Bank (ECB)

Sonnemannstrasse 22
Frankfurt am Main, 60314
Germany

Bank of Greece

21 E. Venizelos Avenue
GR 102 50 Athens
Greece

HOME PAGE: http://www.bankofgreece.gr
