Policy Learning with α-Expected Welfare

41 Pages · Posted: 21 May 2025

Yanqin Fan

University of Washington

Yuan Qi

University of Washington

Gaoqian Xu

University of Washington

Abstract

This paper proposes an optimal policy that targets the average welfare of the worst-off α-fraction of the post-treatment outcome distribution. We refer to this policy as the α-Expected Welfare Maximization (α-EWM) rule, where α ∈ (0, 1] denotes the size of the subpopulation of interest. The α-EWM rule interpolates between the expected welfare (α = 1) and the Rawlsian welfare (α → 0). For α ∈ (0, 1), an α-EWM rule can be interpreted as a distributionally robust EWM rule that allows the target population to have a different distribution than the study population. Using the dual formulation of our α-expected welfare function, we propose a debiased estimator for the optimal policy and establish its asymptotic upper regret bounds. In addition, we develop asymptotically valid inference for the optimal welfare based on the proposed debiased estimator. We examine the finite sample performance of the debiased estimator and inference via both real and synthetic data.
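For intuition, the α-expected welfare of an outcome Y is the average of its worst-off α-fraction, i.e., the lower Average Value at Risk named in the keywords, and it admits the standard Rockafellar–Uryasev dual representation W_α(Y) = max_t { t − E[(t − Y)₊]/α }. The NumPy sketch below illustrates the primal and dual forms on simulated outcomes; it is an illustration of this standard identity, not the paper's debiased estimator, and the function names are ours.

```python
import numpy as np

def alpha_expected_welfare(y, alpha):
    """Primal form: mean outcome over the worst-off alpha-fraction
    of the sample (empirical lower Average Value at Risk)."""
    y = np.sort(np.asarray(y, dtype=float))
    k = max(1, int(np.ceil(alpha * y.size)))  # size of the worst-off group
    return y[:k].mean()

def alpha_expected_welfare_dual(y, alpha):
    """Dual (Rockafellar-Uryasev) form: max_t { t - E[(t - Y)_+] / alpha },
    whose maximizer t* is the alpha-quantile of Y."""
    y = np.asarray(y, dtype=float)
    t = np.quantile(y, alpha)  # plug in the empirical alpha-quantile
    return t - np.mean(np.maximum(t - y, 0.0)) / alpha

rng = np.random.default_rng(0)
y = rng.normal(size=100_000)                     # simulated outcomes
print(alpha_expected_welfare(y, 0.25))           # primal estimate
print(alpha_expected_welfare_dual(y, 0.25))      # dual estimate (agrees closely)
print(alpha_expected_welfare(y, 1.0), y.mean())  # alpha = 1 recovers the mean
```

The dual form trades a tail average for an expectation of a hinge-type loss, which is the kind of reformulation that makes debiased estimation and regret analysis tractable; the paper's actual estimator builds on its own dual formulation rather than this textbook version.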

Keywords: Average Value at Risk, Optimal Welfare Inference, Regret Bounds, Targeted Policy, Treatment Effects

Suggested Citation

Fan, Yanqin and Qi, Yuan and Xu, Gaoqian, Policy Learning with α-Expected Welfare. Available at SSRN: https://ssrn.com/abstract=5263638 or http://dx.doi.org/10.2139/ssrn.5263638

Yanqin Fan (Contact Author)

University of Washington ( email )

Seattle, WA 98195
United States

Yuan Qi

University of Washington ( email )

Gaoqian Xu

University of Washington ( email )
