Sniff Tests in Economics: Aggregate Distribution of Their Probability Values and Implications for Publication Bias
52 Pages
Posted: 1 Oct 2018
Date Written: September 11, 2018
The increasing demand for rigor in empirical economics has led to the growing use of auxiliary tests (balance, specification, over-identification, placebo, etc.) that support the credibility of a paper's main results. We dub these "sniff tests" because standards for passing are subjective and rejection is bad news for the author. Sniff tests offer a new window into publication bias because authors prefer them to be insignificant, the reverse of standard statistical tests. Collecting a sample of nearly 30,000 sniff tests across 60 economics journals, we provide the first estimate of their aggregate probability-value (p-value) distribution. For the subsample of balance tests in randomized controlled trials (for which the distribution of p-values is known to be uniform absent publication bias, allowing reduced-form methods to be employed), estimates suggest that 45% of failed tests remain in the "file drawer" rather than being published. For the remaining sample, whose distribution of p-values is unknown, structural estimates suggest an even larger file-drawer problem, as high as 91%. Fewer significant sniff tests show up in top-tier journals, smaller tables, and more recent articles. We find no evidence of author manipulation other than a tendency to attribute significant sniff tests too readily to bad luck.
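The reduced-form logic for the balance-test subsample can be illustrated with a short simulation: when covariates are truly balanced, balance-test p-values are Uniform(0, 1), so any shortfall of published p-values below the significance threshold identifies the share of failed tests left in the file drawer. The sketch below is illustrative only and is not the paper's estimator; the sample size, threshold, and true file-drawer share are assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumptions for illustration (not the paper's data or parameters).
N = 100_000          # latent balance tests
alpha = 0.05         # conventional significance threshold
q_true = 0.45        # hypothetical share of failed tests left unpublished

# Under the null of true balance, p-values are Uniform(0, 1).
p = rng.uniform(0, 1, N)
failed = p < alpha

# Failed tests reach publication with probability 1 - q_true; passed tests always do.
published = ~failed | (rng.uniform(0, 1, N) < 1 - q_true)
p_obs = p[published]

# Reduced-form recovery: under uniformity, the observed share s of published
# p-values below alpha pins down the file-drawer share q via
#   s = alpha(1 - q) / (alpha(1 - q) + (1 - alpha)).
s = np.mean(p_obs < alpha)
q_hat = 1 - (s / (1 - s)) * ((1 - alpha) / alpha)
print(f"observed significant share: {s:.4f}, implied file-drawer share: {q_hat:.3f}")
```

With these assumed inputs the recovered file-drawer share is close to the 0.45 used to generate the data, which is the sense in which the uniform benchmark makes a reduced-form estimate possible for balance tests but not for the broader sample, where the null distribution of p-values is unknown.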
Keywords: Publication Bias, Null Hypothesis, Specification Test, Balance Test, Placebo Test
JEL Classification: C18, A14, B41