Machine Learning Instrument Variables for Causal Inference
EC '20: Proceedings of the 21st ACM Conference on Economics and Computation
46 Pages. Posted: 6 Apr 2019. Last revised: 18 Jul 2023.
Date Written: March 15, 2019
Abstract
Instrumental variables (IVs) are a commonly used technique for causal inference from observational data. In practice, the variation induced by IVs can be limited, which yields imprecise or biased estimates of causal effects and renders the approach ineffective for policy decisions. We confront this challenge by formulating the problem of constructing instrumental variables from candidate exogenous data as a learning problem. We provide formal asymptotic theory and show that root-n consistency and asymptotic efficiency of our estimators hold under very general conditions. We show that for linear models with homoskedasticity, this translates to a standard learning problem with cross-fitting. Simulations and an application to real-world data demonstrate that the algorithm is highly effective and significantly improves the performance of instrumental variable estimators from observational data. Finally, we revisit recent research that critiqued the use of political cycles as an instrument for advertising. Specifically, the authors of that research test the strength of the first stage category by category for 274 product categories and find that, for most categories (221 of 274), the first-stage F-statistics are less than 10 in their benchmark. We demonstrate that most of the issues they identify can be resolved using machine learning instrumental variables (MLIVs).
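As a rough illustration of the cross-fitting idea mentioned in the abstract, the following Python sketch learns an instrument from candidate exogenous data on simulated observations and uses it in a just-identified IV estimate. It is a minimal sketch under stated assumptions, not the paper's estimator: the data-generating process, the quadratic feature map, the ridge learner, and the function names (features, ridge_fit, cross_fit_instrument, iv_estimate) are all hypothetical choices made for this example.

```python
import numpy as np


def features(W):
    # Hypothetical feature map: intercept, linear terms, squared terms.
    return np.column_stack([np.ones(len(W)), W, W ** 2])


def ridge_fit(F, t, lam=1.0):
    # Closed-form ridge regression; stands in for any first-stage learner.
    k = F.shape[1]
    return np.linalg.solve(F.T @ F + lam * np.eye(k), F.T @ t)


def cross_fit_instrument(W, x, n_folds=5, seed=0):
    # For each fold, fit the first stage on the other folds and predict
    # the treatment on the held-out fold, so that no observation's own
    # noise is used to build its instrument value.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    F = features(W)
    z_hat = np.empty(len(x))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(x)), held_out)
        coef = ridge_fit(F[train], x[train])
        z_hat[held_out] = F[held_out] @ coef
    return z_hat


def iv_estimate(z, x, y):
    # Just-identified IV slope with an intercept (variables demeaned).
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    return (zc @ yc) / (zc @ xc)


# Simulated data (an assumption of this sketch, not from the paper).
rng = np.random.default_rng(0)
n = 5000
true_beta = 2.0
W = rng.normal(size=(n, 3))           # candidate exogenous data
u = rng.normal(size=n)                # unobserved confounder
x = W[:, 0] ** 2 + u + rng.normal(size=n)        # nonlinear first stage
y = true_beta * x + 2.0 * u + rng.normal(size=n)

z_hat = cross_fit_instrument(W, x)
beta_iv = iv_estimate(z_hat, x, y)

# Naive OLS slope for comparison; it is confounded by u.
xc, yc = x - x.mean(), y - y.mean()
beta_ols = (xc @ yc) / (xc @ xc)

print("OLS estimate: ", beta_ols)
print("MLIV estimate:", beta_iv)
```

In this simulation the treatment depends on the exogenous data only through a squared term, so a purely linear first stage would be weak, while the learned instrument captures the nonlinear variation. Any other supervised learner could replace the ridge step; the cross-fitting loop is the part that matters here.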
Keywords: Econometrics, Machine Learning, Causal Inference, Empirical Industrial Organization