Testing the Presence of Outliers in Regression Models
65 Pages. Posted: 8 Aug 2018. Last revised: 13 Jun 2019
Date Written: July 20, 2018
Algorithms used to detect outliers in regression models have a positive probability of finding outliers even when the data-generating process contains none. Using distributional results that we derive for the rate of falsely discovered outliers, we propose two sets of tests for the overall presence of outliers. The first set tests whether the proportion of detected outliers differs from its expected value. The second set, the 'scaling' tests, tests whether the proportion of detected outliers decreases proportionally with the significance level used for detection. We apply the tests to a model of economic growth and to a difference-in-differences panel model assessing the effectiveness of a carbon tax. The tests are valid for stationary as well as (stochastically) trending regressors and can readily be implemented using Autometrics in PcGive or the R package gets.
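The idea that detection algorithms flag some observations even in outlier-free data can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: it uses a single pass of ordinary least squares with a normal cut-off on standardized residuals, not indicator saturation or the iterated 1-step Huber-skip M-estimator, and it does not implement the paper's test statistics or their distribution theory. The function names `cutoff_for` and `detected_proportion`, the regression coefficients, and the sample size are hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist


def cutoff_for(gamma):
    """Two-sided standard normal cut-off c such that P(|Z| > c) = gamma."""
    return NormalDist().inv_cdf(1.0 - gamma / 2.0)


def detected_proportion(n, gamma, seed=0):
    """Simulate an outlier-free linear regression, fit it by OLS, and return
    the share of observations whose absolute residual exceeds c * sigma_hat.

    Assumption for illustration: Gaussian errors and one Gaussian regressor.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

    # Closed-form OLS for a model with an intercept and one regressor.
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residuals and the degrees-of-freedom-corrected residual standard error.
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma_hat = math.sqrt(sum(r * r for r in resid) / (n - 2))

    # Flag observations beyond the cut-off implied by the significance level.
    c = cutoff_for(gamma)
    flagged = sum(1 for r in resid if abs(r) > c * sigma_hat)
    return flagged / n


if __name__ == "__main__":
    n = 5000
    for gamma in (0.05, 0.025, 0.01):
        share = detected_proportion(n, gamma)
        # Without outliers, share / gamma should stay close to one as gamma shrinks.
        print(f"gamma={gamma:.3f}  detected share={share:.4f}  ratio={share / gamma:.2f}")
```

In this outlier-free setting the detected share should lie close to the significance level and shrink roughly in proportion to it, which is the pattern the two sets of tests examine. Judging whether an observed deviation is statistically significant requires the distributional results derived in the paper.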
Keywords: misspecification, outlier detection, robust estimation, iterated 1-step Huber-skip M-estimator, indicator saturation
JEL Classification: C12, C52