Robust Partially Observable Markov Decision Processes

32 Pages
Posted: 19 Jun 2018

Mohammad Rasouli

Stanford University

Soroush Saghafian

Harvard University - Harvard Kennedy School (HKS)

Date Written: June 13, 2018

Abstract

In a variety of applications, decisions need to be made dynamically after receiving imperfect observations about the state of an underlying system. Partially Observable Markov Decision Processes (POMDPs) are widely used in such applications. To use a POMDP, however, a decision-maker must have access to reliable estimates of the core state and observation transition probabilities under each possible state and action pair. This is often challenging, mainly due to a lack of ample data, especially when some actions are not taken frequently enough in practice. This significantly limits the application of POMDPs in real-world settings. In healthcare, for example, medical tests are typically subject to false-positive and false-negative errors, and hence the decision-maker has imperfect information about the health state of a patient. Furthermore, since some treatment options have not been recommended or explored in the past, data cannot be used to reliably estimate all the required transition probabilities regarding the health state of the patient. We introduce an extension of POMDPs, termed Robust POMDPs (RPOMDPs), which allows dynamic decision-making when there is ambiguity regarding transition probabilities. This extension enables robust decisions by reducing the reliance on a single probabilistic model of transitions, while still allowing for imperfect state observations. We develop dynamic programming equations for solving RPOMDPs, provide a sufficient statistic and an information state, discuss ways in which their computational complexity can be reduced, and connect them to stochastic zero-sum games with imperfect private monitoring.
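To make the idea of "reducing the reliance on a single probabilistic model" concrete, the following is a minimal one-step sketch, not the dynamic programming equations developed in the paper. It assumes a finite ambiguity set (a short list of candidate transition matrices per action), a belief over two health states that is simply taken as given, and a reward that depends only on the next state. All names (`expected_reward`, `robust_action`, `ambiguity_set`) and all numbers are hypothetical and chosen for illustration only.

```python
def expected_reward(belief, transition, reward):
    """Expected next-state reward under ONE candidate transition model.

    belief[s]         : probability that the current state is s
    transition[s][s2] : P(next state = s2 | current state = s) for a fixed action
    reward[s2]        : reward collected in next state s2
    """
    return sum(
        belief[s] * transition[s][s2] * reward[s2]
        for s in range(len(belief))
        for s2 in range(len(reward))
    )


def robust_action(belief, ambiguity_set, reward):
    """Pick the action with the best worst-case expected reward (max-min).

    ambiguity_set[action] is a list of candidate transition matrices;
    an action backed by plenty of data may have a single candidate.
    """
    worst_case = {
        action: min(expected_reward(belief, model, reward) for model in models)
        for action, models in ambiguity_set.items()
    }
    best = max(worst_case, key=worst_case.get)
    return best, worst_case


# Illustrative (made-up) numbers: state 0 = sick, state 1 = healthy.
belief = [0.6, 0.4]   # imperfect test results leave the state uncertain
reward = [0.0, 1.0]   # reward 1 if the patient is healthy next period

ambiguity_set = {
    # Frequently used treatment: one well-estimated model (assumed).
    "standard": [[[0.5, 0.5], [0.1, 0.9]]],
    # Rarely used treatment: two plausible models (assumed).
    "novel": [
        [[0.2, 0.8], [0.1, 0.9]],   # optimistic candidate
        [[0.7, 0.3], [0.2, 0.8]],   # pessimistic candidate
    ],
}

action, worst_case = robust_action(belief, ambiguity_set, reward)
print(action, worst_case)
```

With these numbers, the optimistic model alone would favor the "novel" treatment, while the max-min rule selects "standard", because the pessimistic candidate lowers the worst-case value of "novel" below that of "standard". A full RPOMDP additionally has to handle multiple periods and ambiguity in the observation probabilities, which this sketch omits.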

Keywords: Robust Dynamic Decision-Making, Ambiguity, Imperfect State Observation, Dynamic Programming, Sufficient Statistic, Information State, Stochastic Zero-Sum Games

Suggested Citation

Rasouli, Mohammad and Saghafian, Soroush, Robust Partially Observable Markov Decision Processes (June 13, 2018). HKS Working Paper No. RWP18-027, Available at SSRN: https://ssrn.com/abstract=3195310 or http://dx.doi.org/10.2139/ssrn.3195310

Mohammad Rasouli

Stanford University

Stanford, CA 94305
United States

Soroush Saghafian (Contact Author)

Harvard University - Harvard Kennedy School (HKS)

79 John F. Kennedy Street
Cambridge, MA 02138
United States
