Why 'Redefining Statistical Significance' Will Not Improve Reproducibility and Could Make the Replication Crisis Worse

16 Pages Posted: 21 Nov 2017

Harry Crane

Rutgers, The State University of New Jersey - Department of Statistics and Biostatistics

Date Written: November 19, 2017

Abstract

A recent proposal to "redefine statistical significance" (Benjamin et al., Nature Human Behaviour, 2017) claims that false positive rates "would immediately improve" by factors greater than two and that replication rates would double simply by changing the conventional cutoff for 'statistical significance' from P<0.05 to P<0.005. I assess the validity of these claims, focusing especially on how Benjamin et al. neglect the effects of P-hacking in assessing the impact of their proposal. My analysis shows that once P-hacking is accounted for, the perceived benefits of the lower threshold all but disappear, prompting two main conclusions: (i) The claimed improvements to false positive rate and replication rate in Benjamin et al. (2017) are exaggerated and misleading. (ii) There are plausible scenarios under which the lower cutoff will make the replication crisis worse.
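To make the arithmetic behind this kind of claim concrete, the sketch below uses the standard textbook expression for the false positive rate among 'significant' findings, together with a deliberately crude stand-in for P-hacking (several independent attempts at the same true null). It is not the model analyzed in the paper. The function names, the assumption of independent attempts, and the choice to hold power fixed across cutoffs are all illustrative assumptions.

```python
def false_positive_rate(alpha: float, power: float, prior_true: float) -> float:
    """Share of 'significant' findings that are false positives.

    alpha      : significance cutoff (chance a true null passes the test)
    power      : chance a real effect passes the test (held fixed here, an assumption)
    prior_true : assumed fraction of tested hypotheses that are real effects
    """
    false_pos = alpha * (1.0 - prior_true)
    true_pos = power * prior_true
    return false_pos / (false_pos + true_pos)


def hacked_alpha(alpha: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tests of a true null
    falls below the cutoff. Independence is a simplifying assumption."""
    return 1.0 - (1.0 - alpha) ** attempts


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    prior_true, power = 0.1, 0.8
    for alpha in (0.05, 0.005):
        for attempts in (1, 10):
            effective = hacked_alpha(alpha, attempts)
            fpr = false_positive_rate(effective, power, prior_true)
            print(f"cutoff={alpha} attempts={attempts} "
                  f"effective_alpha={effective:.4f} fpr={fpr:.3f}")
```

Under these assumptions, ten independent attempts at the 0.005 cutoff give an effective false-alarm chance of roughly 0.049 per true null, close to a single honest test at 0.05. This is one simple way to see how a stricter cutoff could lose much of its apparent benefit if it also prompts more attempts per hypothesis.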

Keywords: replication crisis, p-value, p-hacking, reproducibility, statistical significance

Suggested Citation

Crane, Harry, Why 'Redefining Statistical Significance' Will Not Improve Reproducibility and Could Make the Replication Crisis Worse (November 19, 2017). Available at SSRN: https://ssrn.com/abstract=3074083 or http://dx.doi.org/10.2139/ssrn.3074083

Harry Crane (Contact Author)

Rutgers, The State University of New Jersey - Department of Statistics and Biostatistics

Piscataway, NJ
United States
