Zhejiang University
38 Zheda Road
Hangzhou, 310058
China
Distributional reinforcement learning, Actor-critic, Push-forward policy, Sample-based regularizer