Deep Execution - Value and Policy Based Reinforcement Learning for Trading and Beating Market Benchmarks

32 Pages Posted: 21 May 2019

Kevin Dabérius

Linkoping University - Department of Computer and Information Science (IDA)

Elvin Granat

Linkoping University - Department of Computer and Information Science (IDA)

Patrik Karlsson

drkarlsson.com

Date Written: April 21, 2019

Abstract

In this article we introduce the term "Deep Execution" for the use of deep reinforcement learning (DRL) in optimal execution. We demonstrate two approaches to solving the optimal execution problem: (1) the deep double Q-network (DDQN), a value-based approach, and (2) proximal policy optimization (PPO), a policy-based approach, for trading and beating market benchmarks such as the time-weighted average price (TWAP). We show, first, that DRL can reach the theoretically derived optimum by acting on the environment directly. Second, DRL agents can learn to capitalize on price trends (alpha signals) without directly observing the price. Finally, DRL can use the available information to create dynamic strategies as an informed trader and thus outperform static benchmark strategies such as TWAP.
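To make the TWAP benchmark named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a buy order, a toy price path, and no market impact or fees. The function names `twap_schedule` and `average_execution_price` and all numbers are hypothetical, chosen only for illustration. It shows why a static equal-slice schedule can be beaten by a schedule that reacts to a price trend.

```python
from typing import List


def twap_schedule(total_shares: int, n_slices: int) -> List[int]:
    """Split a parent order into n_slices near-equal child orders (static TWAP)."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, remainder = divmod(total_shares, n_slices)
    # Give one extra share to the first `remainder` slices so the sizes sum exactly.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


def average_execution_price(schedule: List[int], prices: List[float]) -> float:
    """Average price paid per share, assuming each slice fills at the given price.

    Assumption: no market impact, spread, or fees (a deliberate simplification).
    """
    if len(schedule) != len(prices):
        raise ValueError("schedule and prices must have the same length")
    return sum(q * p for q, p in zip(schedule, prices)) / sum(schedule)


# Hypothetical upward-trending price path over four trading intervals.
prices = [100.0, 101.0, 102.0, 103.0]

# Static benchmark: equal slices regardless of the trend.
twap = twap_schedule(1000, 4)
twap_price = average_execution_price(twap, prices)

# A hand-picked front-loaded schedule, standing in for a trend-aware strategy.
front_loaded = [400, 300, 200, 100]
front_loaded_price = average_execution_price(front_loaded, prices)

print(twap, twap_price)
print(front_loaded, front_loaded_price)
```

When buying into a rising price, the front-loaded schedule pays a lower average price than the equal-slice schedule. In this toy setting, that gap is what a trend-aware agent would be trying to capture.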

Keywords: Algorithmic Trading, Deep Learning, Execution Algorithms, Reinforcement Learning, Optimal Execution

JEL Classification: C00

Suggested Citation

Dabérius, Kevin and Granat, Elvin and Karlsson, Patrik, Deep Execution - Value and Policy Based Reinforcement Learning for Trading and Beating Market Benchmarks (April 21, 2019). Available at SSRN: https://ssrn.com/abstract=3374766 or http://dx.doi.org/10.2139/ssrn.3374766

Kevin Dabérius

Linkoping University - Department of Computer and Information Science (IDA)

Linkoping, 58183
Sweden

Elvin Granat

Linkoping University - Department of Computer and Information Science (IDA)

Linköping, 581 83
Sweden
