Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning

35 Pages Posted: 13 Jul 2022

Anthony Coache

University of Toronto - Department of Statistics

Sebastian Jaimungal

University of Toronto - Department of Statistics

Álvaro Cartea

University of Oxford; University of Oxford - Oxford-Man Institute of Quantitative Finance

Date Written: June 29, 2022

Abstract

We propose a novel framework to solve risk-sensitive reinforcement learning (RL) problems where the agent optimises time-consistent dynamic spectral risk measures. Based on the notion of conditional elicitability, our methodology constructs (strictly consistent) scoring functions that are used as penalizers in the estimation procedure. Our contribution is threefold: we (i) devise an efficient approach to estimate a class of dynamic spectral risk measures with deep neural networks, (ii) prove that these dynamic spectral risk measures may be approximated to arbitrary accuracy using deep neural networks, and (iii) develop a risk-sensitive actor-critic algorithm that uses full episodes and does not require any additional nested transitions. We compare our conceptually improved reinforcement learning algorithm with the nested simulation approach and illustrate its performance in two settings: statistical arbitrage and portfolio allocation, on both simulated and real data.
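
To make the role of a consistent scoring function concrete, the following is a minimal sketch and not the authors' implementation. It assumes a static (one-period) setting with simulated Gaussian costs, and it uses the pinball score, which is a standard consistent scoring function for a quantile (value-at-risk). The function names `pinball_score` and `elicit_quantile`, the grid search, and the level `alpha = 0.9` are all illustrative assumptions; the paper itself works with dynamic spectral risk measures and deep neural networks.

```python
import numpy as np


def pinball_score(z, y, alpha):
    """Pinball score of candidate value z against outcomes y at level alpha.

    Its expected value is minimised when z is an alpha-quantile of y,
    which is what makes the quantile an elicitable functional.
    """
    return np.where(y <= z, (1 - alpha) * (z - y), alpha * (y - z))


def elicit_quantile(samples, alpha, grid_size=1001):
    """Estimate the alpha-quantile by minimising the mean score over a grid.

    A grid search stands in for the gradient-based minimisation that a
    neural-network estimator would use (an assumption made for brevity).
    """
    candidates = np.linspace(samples.min(), samples.max(), grid_size)
    mean_scores = np.array(
        [pinball_score(z, samples, alpha).mean() for z in candidates]
    )
    return candidates[np.argmin(mean_scores)]


rng = np.random.default_rng(0)
costs = rng.normal(size=5_000)  # hypothetical simulated costs
alpha = 0.9                     # hypothetical risk level

var_hat = elicit_quantile(costs, alpha)  # score-minimising estimate
var_ref = np.quantile(costs, alpha)      # direct empirical quantile
print(f"score-based estimate: {var_hat:.3f}, empirical quantile: {var_ref:.3f}")
```

The two printed values should agree up to the grid resolution, which illustrates why minimising an expected score can replace direct estimation of the risk measure.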

Keywords: Dynamic Risk Measures, Reinforcement Learning, Risk-Awareness, Elicitability, Consistent Scoring Functions, Time-Consistency, Actor-Critic Algorithm, Portfolio Allocation, Statistical Arbitrage

JEL Classification: C61, G11, C63, C45, C44

Suggested Citation

Coache, Anthony and Jaimungal, Sebastian and Cartea, Álvaro, Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning (June 29, 2022). Available at SSRN: https://ssrn.com/abstract=4149461 or http://dx.doi.org/10.2139/ssrn.4149461

Anthony Coache (Contact Author)

University of Toronto - Department of Statistics

100 St. George St.
Toronto, Ontario M5S 3G3
Canada

HOME PAGE: http://anthonycoache.ca

Sebastian Jaimungal

University of Toronto - Department of Statistics

100 St. George St.
Toronto, Ontario M5S 3G3
Canada

HOME PAGE: http://sebastian.statistics.utoronto.ca

Álvaro Cartea

University of Oxford

Mansfield Road
Oxford, Oxfordshire OX1 4AU
United Kingdom

University of Oxford - Oxford-Man Institute of Quantitative Finance

Eagle House
Walton Well Road
Oxford, Oxfordshire OX2 6ED
United Kingdom
