The Double-edged Sword of Data Mining: Implications on Asset Pricing and Information Efficiency

67 Pages, Posted: 30 Nov 2023

Shikun Ke

Yale School of Management

Date Written: November 10, 2023

Abstract

Does data mining always increase price efficiency? Not necessarily. I incorporate data mining into a standard asset pricing model and identify a novel cost of complexity that arises endogenously from data mining. When a data miner explores alternative data, she faces a training history that becomes scarcer relative to the number of potential predictors (increasing complexity) and increasing difficulty in extracting useful signals (decreasing returns in data efficacy). Together, the cost of complexity and decreasing returns in data efficacy imply a finite optimal level of data mining, so that excessive data mining lowers price informativeness. Empirically, I provide evidence of decreasing returns in data efficacy in the context of the "factor zoo", and I show in a difference-in-differences setting that the release of satellite data reduces price informativeness.

Keywords: Data mining, Price informativeness, Cost of complexity, Factor zoo, Alternative data

JEL Classification: G11, G12, G14

Suggested Citation

Ke, Shikun, The Double-edged Sword of Data Mining: Implications on Asset Pricing and Information Efficiency (November 10, 2023). Available at SSRN: https://ssrn.com/abstract=4633293 or http://dx.doi.org/10.2139/ssrn.4633293

Shikun Ke (Contact Author)

Yale School of Management

165 Whitney Ave
New Haven, CT 06511
