Researchers’ Data Analysis Choices: An Excess of False Positives?
21 Pages Posted: 21 Dec 2017 Last revised: 20 Jul 2020
Date Written: July 17, 2018
Abstract
This paper examines commonly applied methods of data analysis. Given these methods, the central issue concerns the plausibility of the studies’ end products, that is, their conclusions. I argue that the methods chosen often lead to unwarranted conclusions: the data analyses tend to produce the looked-for null-rejections even when the null is much more plausible on prior grounds. Two aspects of the applied data analyses cause obvious problems. First, researchers tend to dismiss “preliminary” findings when these contradict the expected outcome of the research question (the “screen-picking” issue). Second, researchers rarely acknowledge that small p-values should be expected when the number of observations runs into the tens of thousands (the “large N” issue). A large N obviously enhances the chance of a null-rejection even if, in fact, the null hypothesis holds for all practical purposes. The discussion elaborates on these two aspects to explain why researchers generally avoid trying to mitigate false positives via supplementary data analyses. In particular, for no apparent good reason, most research studiously avoids the use of hold-out samples. A final topic concerns the dysfunctional consequences of the standard (“A-journal”) publication process: it tends to buttress the use of research methods prone to false or unwarranted null-rejections.
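The “large N” issue can be illustrated with a minimal sketch. Assuming a one-sample z-test and a hypothetical standardized effect of 0.01 standard deviations (a magnitude negligible for all practical purposes), the expected test statistic grows with the square root of the sample size, so the p-value collapses toward zero as N reaches the tens of thousands:

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical effect size (assumed for illustration): a true mean
# shift of 0.01 standard deviations -- practically indistinguishable
# from the null.
effect = 0.01

for n in (100, 10_000, 1_000_000):
    # Expected z-statistic for a one-sample z-test at sample size n:
    # z = effect * sqrt(n)
    z = effect * math.sqrt(n)
    print(f"N={n:>9,}  z={z:6.2f}  p={two_sided_p(z):.4f}")
```

At N = 100 the p-value is near 0.92; at N = 1,000,000 it is effectively zero, so the null is “rejected” despite the effect being trivial. This is the sense in which small p-values should be expected, not celebrated, in very large samples.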
Keywords: research methodology