Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times

24 Pages. Posted: 20 Oct 2020

Xingyu Bai

University of Illinois at Urbana-Champaign, Industrial Enterprise and Systems Engineering

Xin Chen

University of Illinois at Urbana-Champaign

Menglong Li

University of Illinois at Urbana-Champaign

Alexander Stolyar

University of Illinois at Urbana-Champaign

Date Written: September 2, 2020

Abstract

We consider a generic Markov decision process (MDP) with two controls: one that takes effect immediately and another whose effect is delayed by a positive lead time. Computing the optimal policy of this MDP is difficult when the lead time is large. As the lead time grows, however, one would expect the effect of the delayed action to depend only weakly on the current state, so decoupling the delayed action from the current state should yield good controls. The purpose of this paper is to substantiate this decoupling intuition for some MDPs by establishing asymptotic optimality of semi-open-loop policies, which specify open-loop controls for the delayed action and closed-loop controls for the immediate action. Specifically, we show that for an MDP with a fast mixing property and uniformly bounded cost functions, certain periodic semi-open-loop policies are asymptotically optimal. For a classical lost-sales inventory model with divisible products, we provide an elementary proof of asymptotic optimality of constant-order policies. For the same model with indivisible products and integral order quantities, we prove that a special integral open-loop policy, referred to as a bracket policy, is asymptotically optimal.

Our approach relies on a natural lower bound, provided by the optimal semi-open-loop policies for a finite-horizon problem whose horizon length equals the lead time of the original model. We show that as the horizon length becomes large, the long-run average cost incurred by some specific semi-open-loop policies becomes close to this lower bound.
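
To make the open-loop idea concrete, the following is a minimal Python sketch of how a fixed order sequence might be evaluated in a lost-sales inventory system. It is an illustration under stated assumptions, not the paper's construction: the function names, the linear holding and lost-sales penalty costs, and the floor-difference rule used to generate integer orders are all hypothetical choices made here, and the paper's exact definition of the bracket policy may differ. Because an open-loop order sequence does not depend on the state, the lead time only shifts when orders arrive, so the sketch treats each order as the quantity arriving in that period.

```python
import math
import random


def bracket_orders(rate, periods, offset=0.0):
    """Integer orders whose running total tracks rate * t (assumed construction).

    Each order is floor(rate * (t + 1) + offset) - floor(rate * t + offset),
    so every order is floor(rate) or ceil(rate).
    """
    return [
        math.floor(rate * (t + 1) + offset) - math.floor(rate * t + offset)
        for t in range(periods)
    ]


def average_cost(orders, demands, holding_cost, penalty_cost):
    """Average per-period cost of a lost-sales system under fixed arrivals.

    orders[t] is the quantity arriving in period t; unmet demand is lost.
    The linear cost structure is an assumption of this sketch.
    """
    inventory = 0.0
    total = 0.0
    for order, demand in zip(orders, demands):
        inventory += order                 # delayed order arrives
        sales = min(inventory, demand)     # serve what is on hand
        lost = demand - sales              # excess demand is lost
        inventory -= sales
        total += holding_cost * inventory + penalty_cost * lost
    return total / len(orders)


if __name__ == "__main__":
    rng = random.Random(0)
    horizon = 10_000
    demands = [rng.randint(0, 3) for _ in range(horizon)]  # hypothetical demand
    constant = [1.3] * horizon               # divisible product, constant order
    integral = bracket_orders(1.3, horizon)  # indivisible product, integer orders
    print(average_cost(constant, demands, 1.0, 4.0))
    print(average_cost(integral, demands, 1.0, 4.0))
```

Comparing the two printed averages for several order rates gives a rough feel for how an integer sequence with the same long-run rate behaves relative to a constant order; it says nothing about optimality, which is what the paper establishes analytically.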

Keywords: open-loop policy, asymptotic analysis, Markov decision process, lead time, inventory

Suggested Citation

Bai, Xingyu and Chen, Xin and Li, Menglong and Stolyar, Alexander, Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times (September 2, 2020). Available at SSRN: https://ssrn.com/abstract=3685551 or http://dx.doi.org/10.2139/ssrn.3685551

Xingyu Bai

University of Illinois at Urbana-Champaign, Industrial Enterprise and Systems Engineering

Urbana-Champaign, IL
United States
2177217269 (Phone)

Xin Chen

University of Illinois at Urbana-Champaign

601 E John St
Champaign, IL 61820
United States

Menglong Li (Contact Author)

University of Illinois at Urbana-Champaign

601 E John St
Champaign, IL 61820
United States

Alexander Stolyar

University of Illinois at Urbana-Champaign

ISE Department and Coordinated Science Lab
1308 W. Main Street, 156CSL
Urbana, IL 61801
United States
