# Contextual Learning with Online Convex Optimization: Theory and Application to Chronic Diseases

Posted: 31 Dec 2019


## Esmaeil Keyvanshokooh

University of Michigan at Ann Arbor - Department of Industrial and Operations Engineering

## Mohammad Zhalechian

University of Michigan at Ann Arbor - Department of Industrial and Operations Engineering

## Cong Shi

University of Michigan at Ann Arbor - Department of Industrial and Operations Engineering

## Mark P. Van Oyen

University of Michigan at Ann Arbor

## Pooyan Kazemian

Massachusetts General Hospital; Harvard Medical School

Date Written: December 10, 2019

### Abstract

Chronic diseases are the leading cause of mortality and disability worldwide. They require surveillance and monitoring of each patient to assess disease progression and to determine whether an intervention is warranted for that individual. In many cases, it is a challenge to determine the most effective treatment. Even when a suitable treatment is identified, dosing it correctly remains a major challenge because the proper dosage depends on the individual. This involves adaptively learning a personalized disease progression control model conditional on patient-specific contextual information. We formulate this as a new contextual multi-armed bandit under a two-dimensional patient-specific control with a nested structure, which sequentially selects a personalized treatment and a corresponding dosage for that specific treatment based on contextual information of patients. With the goal of minimizing disease progression risk, we develop contextual learning and optimization algorithms that integrate the strength of contextual bandit learning with online convex optimization. Compared with the clairvoyant optimal policy, we prove a $T$-period regret of $O(\sqrt{T}(d+K)\log(T))$, where $d+K$ is the dimension of the feature vector, and this bound is provably tight up to a logarithmic factor. We derive some general technical results that are of independent interest. We illustrate the effectiveness of our methodology by using data on type 2 diabetes. Compared with current clinical practices and benchmark policies, our approach suggests a decrease in overall disease progression risks, and we obtain critical clinical implications. We believe that our contextual learning and optimization framework could be widely used in many other service systems and diseases.

Keywords: online learning algorithms, regret analysis, contextual bandit, stochastic sub-gradient descent, personalized medicine, disease progression, type 2 diabetes.
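The nested decision described in the abstract (first a treatment, then a dosage for that treatment) can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes a LinUCB-style optimistic rule for the treatment choice and projected stochastic sub-gradient descent for the dosage, and every name here (`NestedContextualLearner`, `simulate`, `alpha`, `dose_bounds`), the step size, and the synthetic risk model are hypothetical choices made only for illustration.

```python
import numpy as np


class NestedContextualLearner:
    """Hypothetical sketch: choose a treatment with a LinUCB-style rule,
    then tune that treatment's dosage by projected sub-gradient descent."""

    def __init__(self, n_treatments, context_dim, dose_bounds=(0.0, 1.0), alpha=1.0):
        self.alpha = alpha  # width of the confidence bonus (assumed tuning knob)
        self.lo, self.hi = dose_bounds
        # Per-treatment ridge-regression statistics for the predicted risk.
        self.A = [np.eye(context_dim) for _ in range(n_treatments)]
        self.b = [np.zeros(context_dim) for _ in range(n_treatments)]
        # Per-treatment dosage iterate and number of updates so far.
        self.dose = np.full(n_treatments, 0.5 * (self.lo + self.hi))
        self.steps = np.zeros(n_treatments, dtype=int)

    def select(self, x):
        """Return (treatment index, dosage) for context vector x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta_hat = A_inv @ b
            # Optimism for a minimization problem: subtract the bonus.
            scores.append(theta_hat @ x - self.alpha * np.sqrt(x @ A_inv @ x))
        k = int(np.argmin(scores))
        return k, float(self.dose[k])

    def update(self, k, x, risk, subgradient):
        """Update the risk model and the dosage of the treatment just used."""
        self.A[k] += np.outer(x, x)
        self.b[k] += risk * x
        self.steps[k] += 1
        eta = 1.0 / np.sqrt(self.steps[k])  # decaying step size (assumed)
        # Sub-gradient step, then projection onto the feasible dose interval.
        self.dose[k] = np.clip(self.dose[k] - eta * subgradient, self.lo, self.hi)


def simulate(n_rounds=600, n_treatments=3, context_dim=4, seed=0):
    """Run the learner on a made-up risk model; return it with the true best doses."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=0.3, size=(n_treatments, context_dim))
    best_dose = rng.uniform(0.2, 0.8, size=n_treatments)
    learner = NestedContextualLearner(n_treatments, context_dim)
    for _ in range(n_rounds):
        x = rng.normal(size=context_dim)
        k, dose = learner.select(x)
        gap = dose - best_dose[k]
        # Synthetic risk: linear in the context plus a convex dosing penalty.
        risk = theta[k] @ x + gap ** 2 + rng.normal(scale=0.1)
        noisy_subgradient = 2.0 * gap + rng.normal(scale=0.1)
        learner.update(k, x, risk, noisy_subgradient)
    return learner, best_dose
```

Calling `simulate()` returns the fitted learner and the synthetic target doses, so `learner.dose` can be compared with `best_dose` for the treatments that were selected often. The sketch carries no regret guarantee and says nothing about clinical dosing.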

Suggested Citation

Keyvanshokooh, Esmaeil and Zhalechian, Mohammad and Shi, Cong and Van Oyen, Mark P. and Kazemian, Pooyan, Contextual Learning with Online Convex Optimization: Theory and Application to Chronic Diseases (December 10, 2019). Available at SSRN: https://ssrn.com/abstract=3501316
