Advance Scheduling with Personalized Learning
Posted: 10 Mar 2020
Date Written: January 17, 2020
Joint online learning and resource allocation is a fundamental problem inherent in many applications. In this problem, an agent must allocate resources while adaptively learning the distributions of unknown parameters under delayed feedback. We introduce a general personalized framework that combines online learning with a broad class of online resource allocation mechanisms under uncertainty in the distributions of both rewards and resource consumption. We prove that our framework achieves sublinear Bayesian regret. As an application of our framework, we develop a contextual learning and optimization algorithm called Personalized Scheduling while Learning with Delay (PSLD) and analyze its theoretical performance. The PSLD algorithm offers an appointment (a server-date pair) online to each arriving customer, based on contextual information about the customer and the servers and on the limited capacity of the system. It operates under uncertainty in both heterogeneous rewards and service times, as well as under adversarial arrivals. We demonstrate the practicality and efficacy of our algorithm using real clinical data from a partner health system. Our results show that the proposed online algorithm compares favorably with other algorithms and outperforms the pervasive First-Come-First-Served policy by a large margin.
Keywords: advance scheduling, online learning, contextual bandit, online resource allocation, regret analysis, personalized healthcare services