Safe Reinforcement Learning with Contextual Information: Theory and Applications

34 Pages. Posted: 2 Oct 2023

Junyu Cao

University of Texas at Austin - McCombs School of Business

Esmaeil Keyvanshokooh

Mays Business School, Texas A&M University

Tian Liu

Texas A&M University

Date Written: September 25, 2023

Abstract

Motivated by a critical medical decision-making problem, we study a sequential decision-making setting of learning a personalized, safe control policy that maximizes an objective function subject to safety constraints that need to be satisfied during the learning process. We formulate this setting as a contextual constrained Markov decision process model with unknown transition probabilities, reward, and constraint functions. We develop a practical and intuitive reinforcement learning (RL) algorithm that accounts for (i) personalization, (ii) safety guarantees, and (iii) general statistical models for handling uncertainty. We conduct a rigorous regret analysis of this framework by seamlessly synthesizing RL theory with statistical machine learning and optimization techniques, proving that it admits a sub-linear regret without violating safety constraints during the learning phase. Our analysis reveals a significant regret-bound improvement compared to existing theoretical results in both safe and contextual RL. To validate our theoretical findings, we use both synthetic data and a granular clinical dataset of patients with co-morbid type 2 diabetes and hypertension, who are at elevated risk for atherosclerotic cardiovascular diseases. Through extensive analyses, we highlight the superiority of our methodology over benchmark policies and current practices. Our work possesses versatile applicability across various domains where safety and personalization matter.
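For readers unfamiliar with the setting, the following is a generic textbook-style sketch of a contextual constrained Markov decision process and of regret under a safety constraint. It is not taken from the paper: the symbols (context x, horizon H, reward r_x, cost c_x, threshold \tau, number of episodes K) are illustrative assumptions, and the authors' own definitions may differ.

```latex
% Illustrative notation only; not the paper's own formulation.
% For a context x (e.g., patient features), a policy \pi is evaluated by
% its expected cumulative reward and expected cumulative cost over H steps:
\[
V_r^{\pi}(x) = \mathbb{E}^{\pi}\!\left[\sum_{h=1}^{H} r_x(s_h, a_h)\right],
\qquad
V_c^{\pi}(x) = \mathbb{E}^{\pi}\!\left[\sum_{h=1}^{H} c_x(s_h, a_h)\right].
\]
% The safe optimal policy for context x solves a constrained problem:
\[
\pi^{*}_{x} \in \arg\max_{\pi} \; V_r^{\pi}(x)
\quad \text{subject to} \quad V_c^{\pi}(x) \le \tau .
\]
% Regret over K episodes with contexts x_1, \dots, x_K and played policies \pi_k:
\[
\mathrm{Reg}(K) = \sum_{k=1}^{K}
\left( V_r^{\pi^{*}_{x_k}}(x_k) - V_r^{\pi_k}(x_k) \right).
\]
% "Sub-linear regret without violating safety constraints" then means
% \mathrm{Reg}(K)/K \to 0 while V_c^{\pi_k}(x_k) \le \tau holds in every episode k.
```

In this reading, the transition probabilities, r_x, and c_x are unknown and must be estimated from data, which is why keeping the constraint satisfied during learning is the hard part.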

Note:

Funding Information: There is no funding for this research.

Conflict of Interests: The authors acknowledge there are no conflicts of interest.

Ethical Approval: The data used in this research was approved under IRB 2022-1303 by Texas A&M University.

Keywords: safe reinforcement learning, personalized medicine, regret analysis, contextual optimization

Suggested Citation

Cao, Junyu and Keyvanshokooh, Esmaeil and Liu, Tian, Safe Reinforcement Learning with Contextual Information: Theory and Applications (September 25, 2023). Available at SSRN: https://ssrn.com/abstract=4583667 or http://dx.doi.org/10.2139/ssrn.4583667

Junyu Cao (Contact Author)

University of Texas at Austin - McCombs School of Business

Austin, TX
United States

Esmaeil Keyvanshokooh

Mays Business School, Texas A&M University

430 Wehner
College Station, TX 77843
United States

HOME PAGE: http://ekshokooh.github.io/

Tian Liu

Texas A&M University

College Station, TX
United States
