Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times

24 Pages Posted: 20 Oct 2020

Xingyu Bai

University of Illinois at Urbana-Champaign, Industrial Enterprise and Systems Engineering

Xin Chen

Georgia Institute of Technology Atlanta, USA

Menglong Li

City University of Hong Kong (CityU) - Department of Management Sciences

Alexander Stolyar

University of Illinois at Urbana-Champaign

Date Written: September 2, 2020

Abstract

We consider a generic Markov decision process (MDP) with two controls: one that takes effect immediately and another whose effect is delayed by a positive lead time. Computing the optimal policy of this MDP is difficult when the lead time is large. As the lead time grows, however, one would expect the effect of the delayed action to depend only weakly on the current state, which suggests that decoupling the delayed action from the current state could yield good controls. The purpose of this paper is to substantiate this decoupling intuition for some MDPs by establishing the asymptotic optimality of semi-open-loop policies, which specify open-loop controls for the delayed action and closed-loop controls for the immediate action. Specifically, we show that for an MDP with a fast mixing property and uniformly bounded cost functions, certain periodic semi-open-loop policies are asymptotically optimal. For a classical lost-sales inventory model with divisible products, we provide an elementary proof of the asymptotic optimality of constant-order policies. For the same model with indivisible products and integral order quantities, we prove that a special integral open-loop policy, referred to as the bracket policy, is asymptotically optimal.

Our approach relies on a natural lower bound, provided by the optimal semi-open-loop policies for a finite-horizon problem whose horizon length equals the lead time of the original model. We show that as the horizon length becomes large, the long-run average cost incurred by some specific semi-open-loop policies approaches this lower bound.
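As a rough illustration of the open-loop ordering rules mentioned above, the following Python sketch evaluates a constant-order policy in a single-item lost-sales system and generates an integer order sequence in the spirit of a bracket policy. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the holding and lost-sales cost rates `h` and `p`, the zero starting inventory, and in particular the floor-difference formula used for the integer orders. Because the orders do not depend on the state, the lead time only shifts when they arrive, so it does not appear in the simulation.

```python
import math


def simulate_constant_order(r, demands, h=1.0, p=4.0):
    """Average per-period cost when r units arrive in every period.

    Assumed cost model: h per unit left in stock at the end of a period,
    p per unit of demand that is lost. Starting inventory is zero.
    """
    inventory = 0.0
    total_cost = 0.0
    for d in demands:
        inventory += r                 # the constant order arrives
        sales = min(inventory, d)      # demand beyond stock is lost
        lost = d - sales
        inventory -= sales
        total_cost += h * inventory + p * lost
    return total_cost / len(demands)


def bracket_orders(r, n):
    """Hypothetical integer order sequence whose running total is floor(r * t).

    The orders are integers that average out to r over long horizons.
    """
    return [math.floor(r * t) - math.floor(r * (t - 1)) for t in range(1, n + 1)]


if __name__ == "__main__":
    print(simulate_constant_order(1.0, [2, 0]))
    print(bracket_orders(1.5, 4))
```

Comparing the simulated average cost across different values of `r` gives a simple way to search for a good constant order quantity under this assumed cost model.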

Keywords: open-loop policy, asymptotic analysis, Markov decision process, lead time, inventory

Suggested Citation

Bai, Xingyu and Chen, Xin and Li, Menglong and Stolyar, Alexander, Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times (September 2, 2020). Available at SSRN: https://ssrn.com/abstract=3685551 or http://dx.doi.org/10.2139/ssrn.3685551

Xingyu Bai

University of Illinois at Urbana-Champaign, Industrial Enterprise and Systems Engineering

Urbana-Champaign, IL
United States
2177217269 (Phone)

Xin Chen

Georgia Institute of Technology Atlanta, USA

Menglong Li (Contact Author)

City University of Hong Kong (CityU) - Department of Management Sciences

Tat Chee Avenue
Kowloon Tong
Kowloon
Hong Kong
+852 34428578 (Phone)
+852 34420189 (Fax)

HOME PAGE: http://menglongli.com

Alexander Stolyar

University of Illinois at Urbana-Champaign

ISE Department and Coordinated Science Lab
1308 W. Main Street, 156CSL
Urbana, IL 61801
United States
