The Impact of Data Mining on Information Disclosure by Regulatory Agencies: With an Application to Redlining

56 Pages. Posted: 20 Jun 2018. Last revised: 18 Oct 2018

W.C. Bunting

Fox School of Business and Management

Date Written: August 20, 2018

Abstract

Data mining techniques can be used to locate statistical outliers that are incorrectly characterized as evidence of unlawful conduct. Using home mortgage loan data made publicly available by financial regulators, a simple data mining exercise finds that approximately three percent of all lender-MSA (metropolitan statistical area) pairs, or approximately seven to nine percent of all lending institutions, flagged as having redlined minority neighborhoods are attributable to a failure to correct for the multiple hypothesis testing problem. The false positive rate does not, however, fully explain the high estimated frequency of statistical redlining. Three possible models of information disclosure by regulatory agencies are considered: (1) full information, (2) no information, and (3) limited information. Under a limited information model, litigation serves to correctly implement statistical hypothesis testing: a plaintiff must formulate a hypothesis before examining the data and obtains the information necessary to test this hypothesis only through discovery.
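
As a rough illustration of the multiple hypothesis testing problem mentioned in the abstract, the following Python sketch shows how many tests are flagged at a five percent level when every null hypothesis is true, and how two standard corrections change that count. It is not the paper's method: the abstract does not state which correction is applied, so Bonferroni and Benjamini-Hochberg are used here only as familiar examples. The simulated p-values, the figure of 10,000 tests (standing in for lender-MSA pairs), and all function names are assumptions made for illustration.

```python
import random


def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def simulate_null_p_values(n_tests, seed=0):
    """Hypothetical data: p-values are uniform on [0, 1] when no lender redlines."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_tests)]


# Assumed setting: 10,000 independent tests, all with a true null hypothesis.
p_values = simulate_null_p_values(10_000)

naive_flags = sum(p <= 0.05 for p in p_values)   # no correction
bonf_flags = sum(bonferroni(p_values))           # family-wise error control
bh_flags = sum(benjamini_hochberg(p_values))     # false discovery rate control

print("flagged without correction:", naive_flags)
print("flagged after Bonferroni:", bonf_flags)
print("flagged after Benjamini-Hochberg:", bh_flags)
```

Without correction, about five percent of the tests are expected to be flagged even though no unlawful conduct exists in the simulated data; every such flag is a false positive. Both corrections can only reduce the number of flags relative to the uncorrected count, with Bonferroni being the more conservative of the two.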

Keywords: Data Mining, Multiple Hypothesis Testing, Redlining, Information Disclosure

JEL Classification: C55, K11

Suggested Citation

Bunting, William, The Impact of Data Mining on Information Disclosure by Regulatory Agencies: With an Application to Redlining (August 20, 2018). Fox School of Business Research Paper No. 18-034. Available at SSRN: https://ssrn.com/abstract=3199804 or http://dx.doi.org/10.2139/ssrn.3199804

William Bunting (Contact Author)

Fox School of Business and Management

Philadelphia, PA 19122
United States