Large Scale Continuous-Time Mean-Variance Portfolio Allocation via Reinforcement Learning

15 Pages Posted: 6 Aug 2019


Haoran Wang

Columbia University - Department of Industrial Engineering and Operations Research (IEOR)

Date Written: July 23, 2019

Abstract

We propose to solve the large-scale Markowitz mean-variance (MV) portfolio allocation problem using reinforcement learning (RL). By adopting the recently developed continuous-time exploratory control framework, we formulate the exploratory MV problem in high dimensions. We further show the optimality of a multivariate Gaussian feedback policy, with time-decaying variance, in trading off exploration and exploitation. Based on a provable policy improvement theorem, we devise a scalable and data-efficient RL algorithm and conduct large-scale empirical tests using data on the S&P 500 stocks. We find that our method consistently achieves over 10% annualized returns and outperforms econometric methods and the deep RL method by large margins, for both long and medium investment horizons with monthly and daily trading.
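To make the abstract's central object concrete, the following is a minimal illustrative sketch (not the paper's exact algorithm or parameterization) of a multivariate Gaussian feedback policy whose exploration variance decays as the investment horizon is approached. The linear-in-wealth mean and the names K, b, Sigma0, and lam are assumptions introduced here for illustration only.

    # Illustrative sketch: sample allocations to d risky assets from a
    # multivariate Gaussian feedback policy with time-decaying covariance.
    import numpy as np

    def gaussian_policy(wealth, t, T, K, b, Sigma0, lam):
        """wealth : current portfolio wealth
           t, T   : current time and terminal time
           K, b   : assumed linear-feedback parameters (length-d arrays)
           Sigma0 : assumed baseline covariance (d x d)
           lam    : exploration (entropy-regularization) weight"""
        mean = K * wealth + b              # mean is a linear feedback of wealth
        cov = lam * (T - t) / T * Sigma0   # variance shrinks toward the horizon
        return np.random.multivariate_normal(mean, cov)

    # Example: 3 assets, halfway through a 1-year horizon
    d = 3
    u = gaussian_policy(wealth=1.0, t=0.5, T=1.0,
                        K=np.full(d, -0.2), b=np.full(d, 0.1),
                        Sigma0=np.eye(d), lam=0.5)

In the exploratory control framework the exploration weight lam and the remaining time T - t jointly control how widely the policy samples around its feedback mean; as t approaches T the policy concentrates on exploitation.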

Keywords: reinforcement learning, mean-variance portfolio selection, entropy regularization, stochastic control, Gaussian exploration, policy improvement theorem, high dimensional portfolio allocation

JEL Classification: C16, C63, G11, C02, C45

Suggested Citation

Wang, Haoran, Large Scale Continuous-Time Mean-Variance Portfolio Allocation via Reinforcement Learning (July 23, 2019). Available at SSRN: https://ssrn.com/abstract=3428125 or http://dx.doi.org/10.2139/ssrn.3428125

Haoran Wang (Contact Author)

Columbia University - Department of Industrial Engineering and Operations Research (IEOR) ( email )

331 S.W. Mudd Building
500 West 120th Street
New York, NY 10027
United States

