Multi-Armed Bandits with Endogenous Learning Curves: An Application to Split Liver Transplantation

49 Pages · Posted: 28 May 2021 · Last revised: 11 Oct 2023

Yanhan (Savannah) Tang

Carnegie Mellon University, David A. Tepper School of Business, Students

Andrew Li

Carnegie Mellon University

Alan Andrew Scheller-Wolf

Carnegie Mellon University

Sridhar R. Tayur

Carnegie Mellon University - David A. Tepper School of Business

Date Written: April 13, 2021

Abstract

Problem definition: Proficiency in many sophisticated tasks is attained through experience-based learning, that is, learning by doing. For example, transplant surgeons must perform difficult surgeries to master the required skills, call center staff must handle customer calls to improve their ability to resolve customer issues, and new franchisees learn to operate smoothly over time. This experience-based learning may affect other stakeholders, for example, patients eligible for transplant surgeries. Such a situation illustrates the classical exploration versus exploitation trade-off: a central planner may want to identify and develop surgeons with high aptitude while ensuring that patients still have excellent outcomes and equitable access to organs.

Methodology and results: We formulate a multi-armed bandit (MAB) model in which parametric learning curves are embedded in the reward functions to capture endogenous, experience-based learning. In addition, the model subjects the choice of arms to fairness constraints to ensure equity. To solve our MAB problem we propose the L-UCB and FL-UCB algorithms, variants of the upper confidence bound (UCB) algorithm that attain O(log t) regret on problems featuring experience-based learning and fairness concerns. We demonstrate our model and algorithms on the split liver transplantation (SLT) allocation problem, showing that our algorithms numerically outperform standard bandit algorithms in settings where experience-based learning and fairness concerns exist.

Managerial implications: From a methodological standpoint, our proposed MAB model and algorithms are generic and have broad application prospects. From an application standpoint, our algorithms could be applied to help evaluate potential strategies to increase the proliferation of SLT and other technically difficult medical procedures.
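For intuition, the following is a minimal Python sketch of the kind of setting the paper studies: a stochastic bandit whose arms ("surgeons") yield rewards that improve with the number of pulls through a parametric learning curve. The exponential curve form, the parameters (a, mu0, c), and the plain UCB1 index used here are illustrative assumptions only; they are not the paper's L-UCB or FL-UCB indices, whose exact form and fairness constraints are specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: each arm k ("surgeon") has an exponential
# learning curve, so its mean reward grows with the number of times it
# has been pulled (endogenous, experience-based learning).
#   mu_k(n) = a_k - (a_k - mu0_k) * exp(-c_k * n)   # illustrative form only
K = 3
a = np.array([0.9, 0.8, 0.7])      # asymptotic proficiency per arm (assumed)
mu0 = np.array([0.3, 0.5, 0.6])    # initial proficiency per arm (assumed)
c = np.array([0.05, 0.02, 0.01])   # learning rates per arm (assumed)

def mean_reward(k, n_pulls):
    return a[k] - (a[k] - mu0[k]) * np.exp(-c[k] * n_pulls)

T = 5000
pulls = np.zeros(K, dtype=int)     # number of times each arm was chosen
sums = np.zeros(K)                 # cumulative observed reward per arm

for t in range(1, T + 1):
    if t <= K:
        k = t - 1                  # pull each arm once to initialize
    else:
        # Plain UCB1 index on observed rewards; the paper's L-UCB / FL-UCB
        # modify this index to account for the learning curve and to
        # enforce fairness constraints on arm selection.
        ucb = sums / pulls + np.sqrt(2 * np.log(t) / pulls)
        k = int(np.argmax(ucb))
    reward = rng.binomial(1, mean_reward(k, pulls[k]))  # Bernoulli outcome
    pulls[k] += 1
    sums[k] += reward

print("pulls per arm:", pulls)
```

Because the arms' means shift as they are pulled, a vanilla UCB index can under-explore arms whose proficiency would eventually surpass the others; accounting for the learning curve in the index is precisely the gap the L-UCB and FL-UCB algorithms are designed to address.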

Keywords: Multi-armed bandit, upper confidence bound algorithms, learning, split liver transplantation, fairness

Suggested Citation

Tang, Yanhan and Li, Andrew and Scheller-Wolf, Alan Andrew and Tayur, Sridhar R., Multi-Armed Bandits with Endogenous Learning Curves: An Application to Split Liver Transplantation (April 13, 2021). Available at SSRN: https://ssrn.com/abstract=3855206 or http://dx.doi.org/10.2139/ssrn.3855206

Yanhan Tang (Contact Author)

Carnegie Mellon University, David A. Tepper School of Business, Students

Pittsburgh, PA
United States

Andrew Li

Carnegie Mellon University

5000 Forbes Avenue
Pittsburgh, PA 15213-3890
United States

Alan Andrew Scheller-Wolf

Carnegie Mellon University

Pittsburgh, PA 15213-3890
United States

Sridhar R. Tayur

Carnegie Mellon University - David A. Tepper School of Business

5000 Forbes Avenue
Pittsburgh, PA 15213-3890
United States

Paper statistics

Downloads: 354
Abstract Views: 1,597
Rank: 156,172