Double/Debiased Machine Learning for Treatment and Structural Parameters

73 Pages. Posted: 10 Jul 2017. Last revised: 21 May 2018.

Victor Chernozhukov

Massachusetts Institute of Technology (MIT) - Department of Economics; New Economic School

Denis Chetverikov

University of California, Los Angeles (UCLA) - Department of Economics

Mert Demirer

Massachusetts Institute of Technology (MIT)

Esther Duflo

Massachusetts Institute of Technology (MIT) - Department of Economics; Abdul Latif Jameel Poverty Action Lab (J-PAL); National Bureau of Economic Research (NBER); Centre for Economic Policy Research (CEPR); Bureau for Research and Economic Analysis of Development (BREAD)

Christian Hansen

University of Chicago - Booth School of Business - Econometrics and Statistics

Whitney K. Newey

Massachusetts Institute of Technology (MIT) - Department of Economics; National Bureau of Economic Research (NBER)

James Robins

Harvard University - T.H. Chan School of Public Health

There are 2 versions of this paper.

Date Written: June 2017

Abstract

We revisit the classic semiparametric problem of inference on a low-dimensional parameter θ_0 in the presence of high-dimensional nuisance parameters η_0. We depart from the classical setting by allowing η_0 to be so high-dimensional that the traditional assumptions (such as Donsker properties) that limit the complexity of the parameter space for this object break down. To estimate η_0, we consider the use of statistical or machine learning (ML) methods, which are particularly well-suited to estimation in modern, very high-dimensional cases. ML methods perform well in practice by employing regularization to reduce variance and trading off regularization bias with overfitting. However, both regularization bias and overfitting in estimating η_0 cause a heavy bias in estimators of θ_0 that are obtained by naively plugging ML estimators of η_0 into estimating equations for θ_0. This bias results in the naive estimator failing to be N^(-1/2)-consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ_0 can be removed by using two simple, yet critical, ingredients: (1) Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters, and (2) cross-fitting, which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in an N^(-1/2)-neighborhood of the true parameter values and are approximately unbiased and normally distributed, which allows the construction of valid confidence statements. The generic statistical theory of DML is elementary and relies only on weak theoretical requirements, which admit the use of a broad array of modern ML methods for estimating the nuisance parameters, such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by providing theoretical properties of DML applied to learn the main regression parameter in a partially linear regression model, the coefficient on an endogenous variable in a partially linear instrumental variables model, the average treatment effect and the average treatment effect on the treated under unconfoundedness, and the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
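
To make the two ingredients concrete, the following is a minimal sketch of the cross-fitted, Neyman-orthogonal ("partialling-out") estimator for the partially linear model Y = D·θ_0 + g_0(X) + U, D = m_0(X) + V, the first theoretical application mentioned above. The random-forest learners, the simulated data, and the helper name dml_plr are illustrative assumptions rather than the paper's specification; any ML method that estimates the two conditional expectations well enough could be substituted inside the fold loop.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(Y, D, X, n_folds=5, seed=0):
    """Cross-fitted DML estimate of theta_0 in Y = D*theta_0 + g_0(X) + U."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    V_hat = np.zeros_like(D)  # residuals D - m_hat(X), with m_0(X) = E[D|X]
    W_hat = np.zeros_like(Y)  # residuals Y - l_hat(X), with l_0(X) = E[Y|X]
    for train, test in kf.split(X):
        # Fit nuisance learners on the complement fold, predict on the held-out fold.
        m_hat = RandomForestRegressor(random_state=seed).fit(X[train], D[train])
        l_hat = RandomForestRegressor(random_state=seed).fit(X[train], Y[train])
        V_hat[test] = D[test] - m_hat.predict(X[test])
        W_hat[test] = Y[test] - l_hat.predict(X[test])
    # Partialling-out (Neyman-orthogonal) score: psi = (W - theta*V) * V = 0 in sample.
    theta_hat = np.sum(V_hat * W_hat) / np.sum(V_hat ** 2)
    # Plug-in standard error from the influence function psi / E[V^2].
    psi = (W_hat - theta_hat * V_hat) * V_hat
    se = np.sqrt(np.mean(psi ** 2) / np.mean(V_hat ** 2) ** 2 / len(Y))
    return theta_hat, se

# Illustrative use on simulated data with theta_0 = 0.5 (hypothetical example).
rng = np.random.default_rng(0)
n, p = 1000, 20
X = rng.normal(size=(n, p))
D = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(size=n)
Y = 0.5 * D + np.cos(X[:, 0]) + rng.normal(size=n)
theta_hat, se = dml_plr(Y, D, X)
print(f"theta_hat = {theta_hat:.3f} (SE {se:.3f})")

Swapping in lasso, boosting, or neural-net learners only requires changing the two regressors fitted inside the fold loop; the orthogonal score and the cross-fitting structure are what remove the regularization bias and overfitting bias described in the abstract.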

Suggested Citation

Chernozhukov, Victor and Chetverikov, Denis and Demirer, Mert and Duflo, Esther and Hansen, Christian and Newey, Whitney K. and Robins, James, Double/Debiased Machine Learning for Treatment and Structural Parameters (June 2017). NBER Working Paper No. w23564. Available at SSRN: https://ssrn.com/abstract=2999543

Victor Chernozhukov (Contact Author)

Massachusetts Institute of Technology (MIT) - Department of Economics ( email )

50 Memorial Drive
Room E52-262f
Cambridge, MA 02142
United States
617-253-4767 (Phone)
617-253-1330 (Fax)

HOME PAGE: http://www.mit.edu/~vchern/

New Economic School

100A Novaya Street
Moscow, Skolkovo 143026
Russia

Denis Chetverikov

University of California, Los Angeles (UCLA) - Department of Economics ( email )

8283 Bunche Hall
Los Angeles, CA 90095-1477
United States

Mert Demirer

Massachusetts Institute of Technology (MIT) ( email )

77 Massachusetts Avenue
50 Memorial Drive
Cambridge, MA 02139-4307
United States

Esther Duflo

Massachusetts Institute of Technology (MIT) - Department of Economics ( email )

50 Memorial Drive
Room E52-544
Cambridge, MA 02139
United States
617-258-7013 (Phone)
617-253-6915 (Fax)

Abdul Latif Jameel Poverty Action Lab (J-PAL) ( email )

Cambridge, MA
United States

HOME PAGE: http://www.povertyactionlab.org/

National Bureau of Economic Research (NBER)

1050 Massachusetts Avenue
Cambridge, MA 02138
United States

Centre for Economic Policy Research (CEPR)

London
United Kingdom

Bureau for Research and Economic Analysis of Development (BREAD) ( email )

Duke University
Durham, NC 90097
United States

Christian Hansen

University of Chicago - Booth School of Business - Econometrics and Statistics ( email )

Chicago, IL 60637
United States
773-834-1702 (Phone)

Whitney K. Newey

Massachusetts Institute of Technology (MIT) - Department of Economics ( email )

50 Memorial Drive
E52-262D
Cambridge, MA 02142
United States
617-253-6420 (Phone)

National Bureau of Economic Research (NBER) ( email )

1050 Massachusetts Avenue
Cambridge, MA 02138
United States

James Robins

Harvard University - T.H. Chan School of Public Health ( email )

Boston, MA
United States
