19 Pages. Posted: 3 Mar 2012. Last revised: 26 Mar 2013.
Date Written: May 5, 2012
Tests of statistical significance are routinely used in many research studies. However, there are critics of this paradigm, and also a lingering sense that critical test levels are somewhat arbitrary. This paper adds to the literature by determining the timing and level of acceptance of common tests of statistical inference. Using the archives of the Royal Society, we examined 574 research studies published between 1926 and 1997, by which point adoption was virtually complete. We find that the rate and level of adoption rise over time, in a manner broadly consistent with the theoretical literature on the adoption rate of innovations. We detect the presence of several influences on the rate of adoption, which may include prior custom, the nature of the empirical research topics being reported, the increasing ease of computer processing, and possibly journal editorial policies. We find that confidence/significance testing has been used by a majority of the scientific community for over 50 years; that the customary reliance on 95 percent confidence (five percent significance) is upheld by the data; and that confidence intervals and critical significance levels are both widely reported, often together, in recent decades. For historians of science, these data suggest that neither Fisher nor Pearson conclusively “won” their private war. The study sheds new light on an issue of considerable practical importance: the admissibility of statistical evidence in most courts in the United States.
Keywords: statistics, courtroom, supreme court, Fisher, Pearson, adoption rate, significance levels, confidence levels, history of science
JEL Classification: B2, B4, C12, K00
Suggested Citation:
Gulley, David A., The Adoption of Statistical Tests by Natural Scientists: An Empirical Analysis (May 5, 2012). Available at SSRN: https://ssrn.com/abstract=2012659 or http://dx.doi.org/10.2139/ssrn.2012659