Does Regression Produce Representative Estimates of Causal Effects?
Peter M. Aronow
Yale University - Department of Political Science
Cyrus Samii
New York University (NYU) - Wilf Family Department of Politics
January 31, 2014
EPSA 2013 Annual General Conference Paper 585
It is well known that, with an unrepresentative sample, the estimate of a causal effect may fail to characterize how effects operate in the population of interest. What is less well understood is that conventional estimation practices for observational studies may produce the same problem even with a representative sample. Specifically, causal effects estimated via multiple regression differentially weight each unit's contribution. The "effective sample" that regression uses to generate the causal effect estimate may bear little resemblance to the population of interest. The effects that multiple regression estimates may be nonrepresentative in much the same way as effects produced by quasi-experimental methods such as instrumental variables, matching, or regression discontinuity designs. This implies that there is no general external validity basis for preferring multiple regression on representative samples over quasi-experimental methods. We show how to estimate the implied multiple-regression weights for each unit, thus allowing researchers to visualize the characteristics of the effective sample. We then discuss alternative approaches that, under certain conditions, recover representative average causal effects. The requisite conditions cannot always be met.
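To make the idea of implied regression weights concrete, here is a minimal sketch in Python. The abstract does not state the weight formula, so the sketch assumes the form usually associated with this result: each unit's weight is the squared residual from a linear regression of the treatment on the covariates. The function names, the simulated data, and the single covariate `x` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def treatment_residuals(d, X):
    """Residuals from an OLS regression of treatment d on covariates X (plus intercept)."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    Z = np.column_stack([np.ones(len(d)), X])
    coef, *_ = np.linalg.lstsq(Z, d, rcond=None)
    return d - Z @ coef


def regression_weights(d, X):
    """Assumed implied weights: squared treatment residuals, one per unit."""
    return treatment_residuals(d, X) ** 2


def ols_treatment_coef(y, d, X):
    """Coefficient on d from an OLS regression of y on an intercept, d, and X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    Z = np.column_stack([np.ones(len(y)), d, X])
    coef, *_ = np.linalg.lstsq(Z, np.asarray(y, dtype=float), rcond=None)
    return coef[1]


# Simulated (hypothetical) data: treatment uptake depends strongly on x,
# and the unit-level effect also varies with x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
d = rng.binomial(1, 1.0 / (1.0 + np.exp(-2.0 * x))).astype(float)
tau = 1.0 + x                      # heterogeneous unit-level effects
y = tau * d + x + rng.normal(size=n)

w = regression_weights(d, x)
resid = treatment_residuals(d, x)

# Covariate mean in the nominal sample vs. the weighted "effective sample".
nominal_mean_x = x.mean()
effective_mean_x = np.average(x, weights=w)

# The regression coefficient on d can be written using the same residuals
# (Frisch-Waugh-Lovell), which is why the squared residuals act as weights.
beta_full = ols_treatment_coef(y, d, x)
beta_fwl = np.sum(resid * y) / np.sum(w)
```

Comparing `nominal_mean_x` with `effective_mean_x`, or plotting covariate distributions with and without the weights `w`, is one way to see how far the effective sample drifts from the nominal one. Units whose treatment status is well predicted by the covariates receive weights near zero.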
Number of Pages in PDF File: 39
Keywords: causal inference, external validity, observational studies, effective sample
Date posted: July 8, 2013; Last revised: February 12, 2014