Good prophets know when the end is near
30 Pages. Posted: 25 Nov 2019. Last revised: 17 Mar 2023.
Date Written: November 1, 2019
We consider a class of online decision-making problems with exchangeable actions, where in each period a controller is presented with an input type drawn from some stochastic arrival process and must choose an action, and the final objective depends only on the aggregate type-action counts. This framework encapsulates many online stochastic variants of common optimization problems, with knapsack, bin packing, and generalized assignment as canonical examples. In such settings, we study a natural model-predictive control algorithm. We introduce a general condition under which this algorithm obtains uniform additive loss (independent of the horizon) compared to an optimal solution with full knowledge of arrivals. Our condition builds on the compensated coupling technique of Vera and Banerjee, providing a unified view of how uniform additive loss arises as a consequence of the geometry of the underlying decision-making problem.
Our characterization allows us to derive uniform-loss algorithms for several new settings, including the first such algorithm for online stochastic bin packing. It also lets us study the effect of other modeling assumptions, including choice of horizon, batched decisions, and limited computation. In particular, we show that our condition is fulfilled by the above-mentioned problems when the end of the time horizon is known sufficiently far in advance. In contrast, if uncertainty about the end of the time horizon remains at a late stage, we show that such uniform-loss guarantees are impossible to achieve. We demonstrate the performance of our algorithm via large-scale experiments on real and synthetic data.
Keywords: online stochastic decision-making, approximate dynamic programming, prophet inequalities, bin packing