Learning the Best Price and Ordering Policy under Fixed Costs and Ambiguous Demand
60 Pages. Posted: 9 Apr 2020. Last revised: 18 Dec 2022
Date Written: March 13, 2020
Abstract
We study joint inventory-price control in which a firm chooses among a finite number of prices to influence realized demand and incurs a fixed setup cost whenever it orders. The firm seeks an optimal price and an optimal ordering policy under the long-run average criterion, but it does not know the stationary distribution of the random demand it faces under each price. We propose an adaptive policy in which periods are grouped into intervals, each associated with a single price and a single ordering policy. Pricing follows a learning-while-doing trade-off: when the least-visited price has been used in fewer intervals than a threshold tied to the total number of interval visits across all prices, that price is chosen; otherwise, the chosen price is the one with the highest profit estimated from past observations. Within each interval, ordering follows the $(s,S)$ policy best suited to the empirical demand distribution learned so far under the chosen price. Our policy yields good regret bounds and, empirically, outperforms certain competing alternatives.
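The price-selection rule and the $(s,S)$ ordering rule described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not specify the threshold, so the square-root form below is purely an assumption, and the names `choose_price`, `order_quantity`, `visits`, and `profit_estimates` are hypothetical. The strict inequality in the ordering rule is one common convention, also assumed here.

```python
import math


def choose_price(visits, profit_estimates,
                 threshold=lambda total: math.sqrt(total + 1)):
    """Return the index of the price to use in the next interval.

    visits[k]           -- number of intervals already run at price k
    profit_estimates[k] -- estimated long-run average profit at price k
    threshold           -- ASSUMED exploration threshold as a function of
                           the total visit count (not given in the abstract)
    """
    total = sum(visits)
    # Price with the fewest interval visits so far (ties go to lowest index).
    least = min(range(len(visits)), key=lambda k: visits[k])
    if visits[least] < threshold(total):
        return least  # explore: the least-visited price is under-sampled
    # Exploit: pick the price with the best estimated profit.
    return max(range(len(visits)), key=lambda k: profit_estimates[k])


def order_quantity(inventory, s, S):
    """(s, S) rule: order up to S when inventory falls below s, else nothing."""
    return S - inventory if inventory < s else 0
```

For instance, with `visits = [3, 1, 4]` the assumed threshold is `sqrt(9) = 3`, so the second price (one visit) is explored regardless of the profit estimates. With `visits = [5, 5, 5]` the threshold is `4`, no price is under-sampled, and the rule picks the price with the largest estimate.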
Keywords: Joint Inventory-price Control; Fixed Setup Cost; Adaptive Policy; Regret Bound; Learning while Doing