Estimation and Inference of Counterfactual Cumulative Distribution Function in a High-Dimension Framework: A Distributional Oaxaca--Blinder Decomposition Application
63 pages. Posted: 29 Feb 2024. Last revised: 10 Dec 2024.
Abstract
Estimation and inference for counterfactual cumulative distribution functions (CDFs) underpin the analysis of distributional effects, average treatment effects, and quantile treatment effects. High-dimensional covariates help justify the unconfoundedness assumption of causal inference and alleviate concerns about endogeneity arising from omitted variables. This study considers the estimation and inference of counterfactual CDFs in a high-dimensional framework, with an application to the distributional Oaxaca--Blinder decomposition. We propose two semiparametric estimators of the counterfactual CDF: a double machine learning estimator and a propensity score double-debiased estimator. We derive the asymptotic properties of the proposed estimators and prove that both are semiparametrically efficient. Monte Carlo simulations show that the proposed estimators have good finite-sample properties and smaller bias than existing methods. We apply the proposed methods to the Chinese CHIP2018 data on wage discrimination by gender and hukou status, and to the US CPS 2017 data on union effects on wage distributions, yielding new insights when high-dimensional covariates are included in the analysis.
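To make the target of estimation concrete, the following is a minimal sketch of a distributional Oaxaca--Blinder decomposition in a toy setting. It is not either of the estimators proposed in the paper: it is a simple plug-in estimator that assumes a single discrete covariate with a few cells, whereas the paper addresses high-dimensional covariates. The counterfactual CDF here combines the group-0 conditional outcome distribution with the group-1 covariate distribution. All function names, variable names, and data values are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def ecdf(values, y):
    """Empirical CDF of `values` evaluated at the point y."""
    return sum(v <= y for v in values) / len(values)


def counterfactual_cdf(y, outcomes, groups, cells):
    """Plug-in counterfactual CDF at y (toy estimator, discrete covariate).

    Averages the group-0 conditional CDFs, one per covariate cell,
    using the group-1 cell frequencies as weights.
    """
    # Collect group-0 outcomes by covariate cell.
    y0_by_cell = defaultdict(list)
    for yi, d, x in zip(outcomes, groups, cells):
        if d == 0:
            y0_by_cell[x].append(yi)

    # Covariate cell counts within group 1.
    cells1 = Counter(x for d, x in zip(groups, cells) if d == 1)
    n1 = sum(cells1.values())

    total = 0.0
    for x, count in cells1.items():
        if x not in y0_by_cell:
            # Overlap fails: no group-0 data to learn this cell's CDF from.
            raise ValueError(f"no group-0 observations in cell {x!r}")
        total += (count / n1) * ecdf(y0_by_cell[x], y)
    return total


def decompose(y, outcomes, groups, cells):
    """Split F1(y) - F0(y) into structure and composition effects."""
    f1 = ecdf([yi for yi, d in zip(outcomes, groups) if d == 1], y)
    f0 = ecdf([yi for yi, d in zip(outcomes, groups) if d == 0], y)
    fc = counterfactual_cdf(y, outcomes, groups, cells)
    return {
        "total": f1 - f0,
        "structure": f1 - fc,    # same covariates, different conditional CDF
        "composition": fc - f0,  # same conditional CDF, different covariates
    }


# Hypothetical toy data: wages, group indicator, and a two-cell covariate.
outcomes = [10, 12, 20, 22, 11, 21, 25, 30]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
cells = ["lo", "lo", "hi", "hi", "lo", "hi", "hi", "hi"]

result = decompose(15, outcomes, groups, cells)
```

By construction the structure and composition terms sum to the total gap at each evaluation point. With many covariates the cell-by-cell averaging above breaks down, which is the setting the estimators described in the abstract are designed for.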
Keywords: High-Dimensional Model, Counterfactual CDF, Double Machine Learning