Scalable Deep Reinforcement Learning in the Non-Stationary Capacitated Lot Sizing Problem
23 Pages · Posted: 28 May 2024
Abstract
Capacitated lot sizing problems with stationary and non-stationary demand (SCLSP) are very common in practice. Solving instances with a large number of items using Deep Reinforcement Learning (DRL) is challenging because of the large action space. This paper proposes a new Markov Decision Process (MDP) formulation that decomposes the production quantity decisions of a period into sub-decisions, which dramatically reduces the size of the action space. We demonstrate that applying Deep Controlled Learning (DCL) yields policies that outperform both the benchmark heuristic and a prior DRL implementation. Using the decomposed MDP formulation and the DCL method outlined in this paper, we can solve larger problems than the previous DRL implementation could handle. Moreover, we train the policy under a non-stationary demand model, which allows the trained policy to be applied readily in dynamic environments where demand changes over time.
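To illustrate the kind of decomposition the abstract describes, the sketch below shows one plausible way to split a period's joint production decision into sequential per-item sub-decisions. It is a minimal toy example, not the paper's actual formulation: the state encoding, the capacity-masking rule, and all names (`toy_policy`, `decomposed_period_action`, `QTY_LEVELS`, `CAPACITY`) are illustrative assumptions, and the neural policy is replaced by a random stand-in.

```python
# Hypothetical sketch of the action-space decomposition idea: instead of one joint
# action over all items' production quantities (combinatorial in the number of items),
# the period decision is built one item at a time, each sub-decision conditioned on
# the quantities already fixed. All constants and the masking logic are assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_ITEMS = 10      # number of items
QTY_LEVELS = 5    # discrete production-quantity choices per item: 0, 1, ..., 4
CAPACITY = 20     # shared production capacity of the period


def toy_policy(state: np.ndarray) -> np.ndarray:
    """Stand-in for a trained policy network returning scores over quantity levels.

    A real implementation would feed `state` through a neural network; random
    scores suffice to illustrate the control flow of the decomposition.
    """
    del state  # unused in this toy stand-in
    return rng.normal(size=QTY_LEVELS)


def decomposed_period_action(inventories: np.ndarray) -> np.ndarray:
    """Build the period's production plan via sequential per-item sub-decisions.

    Each sub-decision sees the inventories, the partial plan, and the remaining
    capacity, so the per-step action space has only QTY_LEVELS choices instead
    of QTY_LEVELS ** N_ITEMS for a single joint decision.
    """
    plan = np.zeros(N_ITEMS, dtype=int)
    remaining = CAPACITY
    for item in range(N_ITEMS):
        # State for this sub-decision: inventories, partial plan, remaining capacity, item index.
        state = np.concatenate([inventories, plan, [remaining, item]])
        scores = toy_policy(state)
        # Mask quantity levels that would exceed the remaining period capacity.
        feasible = np.arange(QTY_LEVELS) <= remaining
        scores = np.where(feasible, scores, -np.inf)
        qty = int(np.argmax(scores))
        plan[item] = qty
        remaining -= qty
    return plan


if __name__ == "__main__":
    inventories = rng.integers(0, 5, size=N_ITEMS).astype(float)
    print("production plan:", decomposed_period_action(inventories))
```

With this structure, the number of sub-decisions grows linearly in the number of items, which is what makes larger instances tractable for a DRL method such as DCL.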
Keywords: deep reinforcement learning, capacitated lot sizing, non-stationary demand