The Limits of p-Hacking: A Thought Experiment

16 Pages. Posted: 16 Nov 2018. Last revised: 21 Jul 2019.

Date Written: July 19, 2019

Abstract

Suppose that asset pricing factors are just p-hacked noise. How much p-hacking is required to produce the 300 factors documented by academics? I show that, if 10,000 academics generate 1 factor every minute, it takes 15 million years of p-hacking. This absurd conclusion comes from applying the p-hacking theory to published data. To fit the fat right tail of published t-stats, the p-hacking theory requires that the probability of publishing t-stats < 6.0 is infinitesimal. Thus it takes a ridiculous amount of p-hacking to publish a single t-stat. These results show that p-hacking alone cannot explain the factor zoo.
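
The scale of the headline figure can be checked with back-of-the-envelope arithmetic. The sketch below only converts the abstract's numbers (10,000 academics, 1 factor per minute, 15 million years, 300 published factors) into an implied count of p-hacking attempts; it does not reproduce the paper's estimation. The 365-day year, the assumption of nonstop work, and the function name `implied_attempts` are illustrative choices, not taken from the paper.

```python
# Back-of-the-envelope check of the abstract's headline numbers.
# Assumption (not from the paper): a 365-day year and nonstop work.
MINUTES_PER_YEAR = 365 * 24 * 60


def implied_attempts(academics: int, factors_per_minute: int, years: int) -> int:
    """Total factors tried if every academic works every minute of every year."""
    return academics * factors_per_minute * years * MINUTES_PER_YEAR


# Figures quoted in the abstract.
total_attempts = implied_attempts(academics=10_000, factors_per_minute=1, years=15_000_000)
published_factors = 300

# Attempts needed per published factor, and the implied chance that a
# single attempt ends up published under the pure p-hacking story.
attempts_per_published = total_attempts / published_factors
publication_probability = published_factors / total_attempts

print(f"total attempts:           {total_attempts:.3e}")
print(f"attempts per publication: {attempts_per_published:.3e}")
print(f"publication probability:  {publication_probability:.3e}")
```

Under these assumptions the total is roughly 7.9e16 attempts, or about 2.6e14 attempts per published factor, which is the sense in which the probability of publishing any single p-hacked t-stat must be infinitesimal.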

Keywords: Stock return anomalies, publication bias, data mining, multiple testing, p-hacking

JEL Classification: G10, G12

Suggested Citation

Chen, Andrew Y., The Limits of p-Hacking: A Thought Experiment (July 19, 2019). Available at SSRN: https://ssrn.com/abstract=3272572 or http://dx.doi.org/10.2139/ssrn.3272572

Andrew Y. Chen (Contact Author)

Federal Reserve Board

20th and C Streets, NW
Washington, DC 20551
United States
202-973-6941 (Phone)

HOME PAGE: http://sites.google.com/site/chenandrewy/
