Policy Learning with α-Expected Welfare
41 Pages · Posted: 21 May 2025
Abstract
This paper proposes an optimal policy that targets the average welfare of the worst-off α-fraction of the post-treatment outcome distribution. We refer to this policy as the α-Expected Welfare Maximization (α-EWM) rule, where α ∈ (0, 1] denotes the size of the subpopulation of interest. The α-EWM rule interpolates between the expected welfare (α = 1) and the Rawlsian welfare (α → 0). For α ∈ (0, 1), an α-EWM rule can be interpreted as a distributionally robust EWM rule that allows the target population to have a different distribution than the study population. Using the dual formulation of our α-expected welfare function, we propose a debiased estimator for the optimal policy and establish its asymptotic upper regret bounds. In addition, we develop asymptotically valid inference for the optimal welfare based on the proposed debiased estimator. We examine the finite sample performance of the debiased estimator and inference via both real and synthetic data.
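As a rough illustration of the object the abstract describes (not the paper's own notation): if the α-expected welfare of a policy π is taken to be the average of the lowest α-fraction of the post-treatment outcome Y(π), i.e. its lower average value at risk, then the standard Rockafellar–Uryasev dual representation gives
$$
W_\alpha(\pi) \;=\; \frac{1}{\alpha}\int_0^\alpha F_{Y(\pi)}^{-1}(u)\,du \;=\; \max_{t\in\mathbb{R}}\Big\{\, t \;-\; \frac{1}{\alpha}\,\mathbb{E}\big[(t - Y(\pi))_+\big] \Big\},
$$
so that α = 1 recovers the expected welfare $\mathbb{E}[Y(\pi)]$ and α → 0 approaches the essential infimum (the Rawlsian criterion). The symbols $W_\alpha$, $Y(\pi)$, and $F_{Y(\pi)}$ are our own sketch notation and need not match the paper's definitions.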
Keywords: Average Value at Risk, Optimal Welfare Inference, Regret Bounds, Targeted Policy, Treatment Effects