Deep Bellman Hedging

17 Pages. Posted: 13 Jul 2022. Last revised: 18 Jul 2022.

Hans Buehler

JP Morgan

Murray Phillip

J.P. Morgan Chase & Co.

Ben Wood

JP Morgan Chase

Date Written: June 30, 2022


We present an actor-critic-type reinforcement learning algorithm for hedging a portfolio of financial instruments, such as securities and over-the-counter derivatives, using purely historical data. The key characteristics of our approach are: the ability to hedge with derivatives such as forwards, swaps, futures, and options; incorporation of trading frictions such as trading costs and liquidity constraints; applicability to any reasonable portfolio of financial instruments; realistic, continuous state and action spaces; and formal risk-adjusted return objectives.

Most importantly, the trained model provides an optimal hedge for arbitrary initial portfolios and market states without the need for re-training.

We also prove the existence of finite solutions to our Bellman equation, and show the relation to our vanilla Deep Hedging approach.
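
To illustrate what a risk-adjusted return objective can look like, the following minimal Python sketch evaluates the entropic monetary utility, a standard example of a convex risk measure (up to sign). The abstract does not state which objective the paper uses, so the choice of utility, the function name entropic_utility, the risk-aversion value, and the P&L samples are all assumptions made for illustration only.

```python
import math


def entropic_utility(pnl, risk_aversion):
    """Entropic monetary utility U(X) = -(1/lam) * log E[exp(-lam * X)].

    `pnl` is a list of equally weighted P&L samples (hypothetical data),
    `risk_aversion` is lam > 0. Larger lam penalises losses more heavily.
    """
    if risk_aversion <= 0:
        raise ValueError("risk_aversion must be positive")
    scaled = [-risk_aversion * x for x in pnl]
    # Log-mean-exp with the maximum shifted out for numerical stability.
    m = max(scaled)
    log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return -log_mean_exp / risk_aversion


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical P&L samples: hedging narrows the distribution but costs some mean.
unhedged = [-4.0, -1.0, 0.5, 2.0, 5.0]
hedged = [-0.8, -0.2, 0.4, 0.6, 1.0]

for name, pnl in (("unhedged", unhedged), ("hedged", hedged)):
    print(name, "mean:", mean(pnl), "utility:", entropic_utility(pnl, 1.0))
```

In this toy data the hedged position has the lower mean P&L but the higher utility, which is the kind of trade-off between expected return and risk that such an objective is meant to capture.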

Keywords: Deep Hedging, Reinforcement Learning, Convex Risk Measures, Hedging

Suggested Citation

Buehler, Hans and Phillip, Murray and Wood, Ben, Deep Bellman Hedging (June 30, 2022). Available at SSRN: or

Hans Buehler (Contact Author)

JP Morgan

4/F, 25 Bank Street
London, E14 5JP
United Kingdom

Murray Phillip

J.P. Morgan Chase & Co.

60 Wall St.
New York, NY 10260
United States

Ben Wood

JP Morgan Chase

United Kingdom
