Time Consistent Reinforcement Learning for Optimal Consumption Under Epstein-Zin Preferences
34 Pages Posted: 20 Mar 2023
Date Written: March 14, 2023
We present a class of least-squares reinforcement learning algorithms for optimal consumption under elasticity of intertemporal substitution and risk aversion preferences. The classical setting of Epstein-Zin utility preferences is cast into a dynamic utility functional framework and shown to exhibit time consistency. Using this dynamic utility functional, we obtain a robust approximation of the optimal consumption problem as a discrete-time Markov Decision Process. We present a least-squares Q-learning algorithm suitable for non-linear monotone certainty equivalents and benchmark its policy estimation convergence properties on an optimal wealth consumption problem against Least Squares Monte Carlo and binomial tree methods. Finally, we demonstrate our least-squares Q-learning algorithm on an optimal consumption problem applied to SPDR S&P 500 ETF Trust (SPY) data.
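For readers unfamiliar with the preference specification, the following is a minimal sketch of one step of the textbook Epstein-Zin recursion, in which a power certainty equivalent of next-period utility is combined with current consumption through a CES aggregator. It is an illustration under stated assumptions, not the paper's algorithm: the function names and the parameters `beta` (discount factor), `psi` (elasticity of intertemporal substitution), and `gamma` (relative risk aversion) are hypothetical labels chosen here, and the sketch assumes `gamma != 1` and `psi != 1`.

```python
import numpy as np


def certainty_equivalent(values, probs, gamma):
    """Power certainty equivalent (E[V^(1-gamma)])^(1/(1-gamma)).

    Assumes strictly positive values and gamma != 1.
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    expectation = np.dot(probs, values ** (1.0 - gamma))
    return float(expectation ** (1.0 / (1.0 - gamma)))


def ez_aggregate(consumption, ce_next, beta, psi):
    """CES aggregator of current consumption and the certainty equivalent.

    Assumes psi != 1, so that rho = 1 - 1/psi is non-zero.
    """
    rho = 1.0 - 1.0 / psi
    inner = (1.0 - beta) * consumption ** rho + beta * ce_next ** rho
    return float(inner ** (1.0 / rho))


# One backward step on a two-outcome (binomial) branch with illustrative numbers.
next_values = [0.5, 1.5]   # hypothetical next-period utilities
probs = [0.5, 0.5]         # hypothetical branch probabilities
ce = certainty_equivalent(next_values, probs, gamma=5.0)
value_today = ez_aggregate(consumption=1.0, ce_next=ce, beta=0.95, psi=1.5)
```

Because the certainty equivalent is non-linear in the distribution of next-period utility, it cannot be replaced by a plain conditional expectation, which is why a Q-learning scheme has to handle non-linear monotone certainty equivalents explicitly.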
Keywords: Optimal Consumption, Dynamic Utility Theory, Certainty Equivalents, Reinforcement Learning, Time Consistency, Epstein-Zin, Wealth Management