The Double-edged Sword of Data Mining: Implications on Asset Pricing and Information Efficiency
67 Pages Posted: 30 Nov 2023
Date Written: November 10, 2023
Does data mining always increase price efficiency? Not necessarily. I incorporate data mining into a standard asset pricing model and identify a novel cost of complexity that arises endogenously from data mining. When a data miner explores alternative data, she faces a training history that becomes scarcer relative to the number of potential predictors (increasing complexity) and increasing difficulty in extracting useful signals (decreasing returns in data efficacy). The cost of complexity and decreasing returns in data efficacy together imply a finite optimal level of data mining, beyond which further data mining lowers price informativeness. Empirically, I provide evidence of decreasing returns in data efficacy in the context of the "factor zoo", and I show that the release of satellite data reduces price informativeness in a difference-in-differences setting.
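The trade-off described in the abstract can be illustrated with a stylized calculation. The sketch below is not the paper's model: it uses expected out-of-sample forecast error as a rough stand-in for (inverse) price informativeness, and every name and number in it (`expected_oos_mse`, `TRAIN_SIZE`, the `1/j` signal strengths) is a hypothetical choice made for illustration. It assumes i.i.d. standard normal predictors and OLS without an intercept, in which case estimation error inflates the forecast error by the factor (T - 1) / (T - P - 1) for T training observations and P predictors.

```python
import numpy as np


def expected_oos_mse(num_predictors, train_size, betas, noise_var=1.0):
    """Expected out-of-sample MSE of OLS using the first `num_predictors` signals.

    Illustrative assumptions (not from the paper): predictors are i.i.d.
    standard normal, there is no intercept, and omitted signals act as
    extra noise. Estimation error then scales the error by
    (T - 1) / (T - P - 1).
    """
    p, t = num_predictors, train_size
    if t - p - 1 <= 0:
        raise ValueError("need train_size > num_predictors + 1")
    # Variance of the signals the data miner has not yet included.
    omitted_var = float(np.sum(betas[p:] ** 2))
    return (noise_var + omitted_var) * (t - 1) / (t - p - 1)


TRAIN_SIZE = 60                      # hypothetical length of the training history
betas = 1.0 / np.arange(1, 51)       # hypothetical signal strengths, decaying in j

# Expected error for every data mining level P = 0, 1, ..., 50.
mse = [expected_oos_mse(p, TRAIN_SIZE, betas) for p in range(len(betas) + 1)]
best_p = int(np.argmin(mse))

print(f"error-minimizing number of predictors: {best_p}")
print(f"expected MSE at the optimum: {mse[best_p]:.3f}")
print(f"expected MSE using all predictors: {mse[-1]:.3f}")
```

Under these assumptions the decaying signal strengths play the role of decreasing returns in data efficacy, and the inflation factor plays the role of the cost of complexity, so the expected error is minimized at an interior number of predictors rather than at the maximum.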
Keywords: Data mining, Price informativeness, Cost of complexity, Factor zoo, Alternative data
JEL Classification: G11, G12, G14