An Instrumental Variable Forest Approach for Detecting Heterogeneous Treatment Effects in Observational Studies
48 Pages. Posted: 2 Oct 2017. Last revised: 17 Mar 2021
Date Written: November 28, 2020
This study addresses the ubiquitous challenge of using big observational data to identify heterogeneous treatment effects. This problem arises in precision medicine, targeted marketing, personalized education, and many other environments. Identifying heterogeneous treatment effects presents several analytical challenges, including high dimensionality and endogeneity. We develop a new instrumental variable tree (IVT) approach that incorporates the instrumental variable (IV) method into a causal tree (CT) to correct for potential endogeneity biases in observational data. Our IVT approach partitions subjects into subgroups such that treatment effects are similar within each subgroup and differ across subgroups. The estimated treatment effects are asymptotically consistent under a set of mild assumptions. Using simulated data, we show that our approach has a better coverage rate and a smaller mean-squared error than the conventional CT approach. We also demonstrate that an instrumental variable forest (IVF) constructed from IVTs has better accuracy and stratification than a generalized random forest (GRF). Finally, by applying the IVF approach to an empirical assessment of laparoscopic colectomy, we demonstrate the importance of accounting for endogeneity to make accurate comparisons of the heterogeneous effects of the treatment (teaching hospitals) and control (non-teaching hospitals) on different types of patients.
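To illustrate why an IV correction matters when comparing subgroup effects, the following is a minimal sketch and not the authors' IVT or IVF algorithm. It assumes a hypothetical data-generating process with a binary instrument `z`, an unobserved confounder `u`, a binary treatment `w`, and two subgroups defined by a covariate `x` that are given in advance rather than learned by a tree. Within each subgroup it compares a naive difference in means with the standard Wald (ratio) IV estimator. All names, coefficients, and effect sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical true effects per subgroup (an assumption for illustration only).
TRUE_EFFECT = {0: 0.5, 1: 2.0}


def simulate(n=200_000, seed=0):
    """Simulate observational data with an endogenous binary treatment."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)   # observed covariate defining two subgroups
    z = rng.integers(0, 2, size=n)   # binary instrument, independent of u
    u = rng.normal(size=n)           # unobserved confounder
    # Treatment uptake depends on the instrument AND the confounder (endogeneity).
    w = (0.8 * z + 0.8 * u + rng.normal(size=n) > 0.4).astype(float)
    tau = np.where(x == 1, TRUE_EFFECT[1], TRUE_EFFECT[0])
    # The confounder also shifts the outcome directly.
    y = tau * w + 1.5 * u + rng.normal(size=n)
    return x, z, w, y


def naive_effect(w, y):
    """Difference in mean outcomes between treated and untreated subjects."""
    return y[w == 1].mean() - y[w == 0].mean()


def wald_effect(z, w, y):
    """Wald IV estimator: instrument's effect on y divided by its effect on w."""
    numerator = y[z == 1].mean() - y[z == 0].mean()
    denominator = w[z == 1].mean() - w[z == 0].mean()
    return numerator / denominator


x, z, w, y = simulate()
results = {}
for g in (0, 1):
    mask = x == g
    results[g] = (naive_effect(w[mask], y[mask]),
                  wald_effect(z[mask], w[mask], y[mask]))
    print(f"subgroup {g}: true={TRUE_EFFECT[g]:.2f} "
          f"naive={results[g][0]:.2f} iv={results[g][1]:.2f}")
```

In this simulated setting the naive estimate absorbs the confounder's influence in both subgroups, while the Wald estimate should land near the assumed true effect because the instrument is independent of the confounder by construction. A tree-based method such as the one described above would additionally have to discover the subgroups from the data.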
Keywords: big data analytics, causal inference, heterogeneous treatment effects, machine learning