Robo-advising: Learning Investors' Risk Preferences via Portfolio Choices
39 Pages. Posted: 6 Nov 2019. Last revised: 18 Nov 2019.
Date Written: November 16, 2019
We introduce a reinforcement learning framework for retail robo-advising. The robo-advisor does not know the investor's risk preference, but learns it over time by observing her portfolio choices in different market environments. We develop an exploration-exploitation algorithm that trades off costly solicitations of portfolio choices from the investor against autonomous trading decisions based on stale estimates of the investor's risk aversion. We show that the algorithm's value function converges to the optimal value function of an omniscient robo-advisor over a number of periods that is polynomial in the sizes of the state and action spaces. By correcting for the investor's mistakes, the robo-advisor may outperform a stand-alone investor, regardless of the investor's opportunity cost of making portfolio decisions.
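The exploration-exploitation tradeoff described above can be illustrated with a minimal sketch. Everything here is hypothetical and much simpler than the paper's PAC-MDP algorithm: we assume a mean-variance investor whose optimal risky-asset weight is w* = mu / (gamma * sigma^2), and a toy advisor that solicits a choice (paying a cost, but recovering gamma exactly) whenever its estimate is missing or stale, and otherwise trades autonomously on the stale estimate.

```python
def mean_variance_weight(mu, sigma2, gamma):
    # Optimal risky-asset weight for a mean-variance investor with
    # risk aversion gamma: w* = mu / (gamma * sigma2). (Illustrative
    # model, not the paper's specification.)
    return mu / (gamma * sigma2)

class RoboAdvisor:
    """Toy advisor: either solicit the investor's portfolio choice
    (costly, but reveals risk aversion) or trade autonomously on a
    possibly stale estimate of gamma. Hypothetical class, not from
    the paper."""

    def __init__(self, solicitation_cost, staleness_limit):
        self.cost = solicitation_cost
        self.staleness_limit = staleness_limit
        self.gamma_hat = None   # current estimate of risk aversion
        self.staleness = 0      # periods since last solicitation

    def act(self, mu, sigma2, true_gamma):
        # Explore: ask the investor when the estimate is missing or stale.
        if self.gamma_hat is None or self.staleness >= self.staleness_limit:
            observed_w = mean_variance_weight(mu, sigma2, true_gamma)
            # Invert the observed choice to recover risk aversion.
            self.gamma_hat = mu / (observed_w * sigma2)
            self.staleness = 0
            return observed_w, -self.cost
        # Exploit: trade autonomously on the stale estimate, at no cost.
        self.staleness += 1
        return mean_variance_weight(mu, sigma2, self.gamma_hat), 0.0

advisor = RoboAdvisor(solicitation_cost=0.01, staleness_limit=3)
# First period: no estimate yet, so the advisor solicits and recovers
# gamma_hat = 2.0; the chosen weight is 0.05 / (2.0 * 0.04) = 0.625.
w, cost = advisor.act(mu=0.05, sigma2=0.04, true_gamma=2.0)
```

Subsequent calls within the staleness limit reuse `gamma_hat` at zero cost; the paper's algorithm replaces this fixed staleness rule with a schedule that yields the stated polynomial convergence guarantee.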
Keywords: robo-advising, reinforcement learning, portfolio selection, probably approximately correct-Markov decision processes (PAC-MDP)
JEL Classification: D14, G02, G11