Large Scale Continuous-Time Mean-Variance Portfolio Allocation via Reinforcement Learning

15 Pages. Posted: 6 Aug 2019

Haoran Wang

Columbia University - Department of Industrial Engineering and Operations Research (IEOR)

Date Written: July 23, 2019

Abstract

We propose to solve the large-scale Markowitz mean-variance (MV) portfolio allocation problem using reinforcement learning (RL). By adopting the recently developed continuous-time exploratory control framework, we formulate the exploratory MV problem in high dimensions. We further show that a multivariate Gaussian feedback policy with time-decaying variance is optimal in trading off exploration and exploitation. Based on a provable policy improvement theorem, we devise a scalable and data-efficient RL algorithm and conduct large-scale empirical tests using data from the S&P 500 stocks. We find that our method consistently achieves annualized returns of over 10% and outperforms econometric methods and a deep RL method by large margins, for both long and medium investment horizons with monthly and daily trading.
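To make the idea of a multivariate Gaussian policy with time-decaying variance concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a caller-supplied mean allocation and base covariance, and it assumes an exponential decay schedule exp(decay_rate * (horizon - t)) for the exploration scale; the names `exploration_scale`, `cholesky`, `sample_allocation`, and `decay_rate` are hypothetical and chosen only for illustration.

```python
import math
import random


def exploration_scale(t, horizon, decay_rate):
    """Covariance multiplier that shrinks to 1.0 as t approaches the horizon.

    The exponential form is an assumption made for this sketch.
    """
    return math.exp(decay_rate * (horizon - t))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_allocation(mean, base_cov, t, horizon, decay_rate, rng):
    """Draw one allocation vector from N(mean, scale(t) * base_cov)."""
    scale = exploration_scale(t, horizon, decay_rate)
    lower = cholesky([[scale * v for v in row] for row in base_cov])
    noise = [rng.gauss(0.0, 1.0) for _ in mean]
    # mean + L @ noise, where L is the Cholesky factor of the scaled covariance
    return [
        m + sum(lower[i][k] * noise[k] for k in range(i + 1))
        for i, m in enumerate(mean)
    ]


# Illustrative inputs only (hypothetical numbers, not taken from the paper).
rng = random.Random(0)
mean = [0.5, 0.3]
base_cov = [[0.04, 0.01], [0.01, 0.09]]
early = sample_allocation(mean, base_cov, 0.0, 1.0, 2.0, rng)
late = sample_allocation(mean, base_cov, 1.0, 1.0, 2.0, rng)
```

In this sketch the draw at t = 0 uses a covariance inflated by exp(2), while the draw at the horizon uses the base covariance unchanged, so exploration is widest early in the investment period and narrows toward its end. In a feedback policy the mean would additionally depend on the current wealth state; that dependence is left to the caller here.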

Keywords: reinforcement learning, mean-variance portfolio selection, entropy regularization, stochastic control, Gaussian exploration, policy improvement theorem, high dimensional portfolio allocation

JEL Classification: C16, C63, G11, C02, C45

Suggested Citation

Wang, Haoran, Large Scale Continuous-Time Mean-Variance Portfolio Allocation via Reinforcement Learning (July 23, 2019). Available at SSRN: https://ssrn.com/abstract=3428125 or http://dx.doi.org/10.2139/ssrn.3428125

Haoran Wang (Contact Author)

Columbia University - Department of Industrial Engineering and Operations Research (IEOR)

331 S.W. Mudd Building
500 West 120th Street
New York, NY 10027
United States
