Revisiting Approximate Linear Programming Using a Saddle Point Approach

52 Pages. Posted: 13 Jun 2017. Last revised: 17 Jun 2018

Qihang Lin

University of Iowa - Henry B. Tippie College of Business

Selvaprabu Nadarajah

University of Illinois at Chicago - College of Business Administration

Negar Soheili

University of Illinois at Chicago - College of Business Administration

Date Written: June 11, 2017

Abstract

Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) of intractable Markov decision processes (MDPs) arising in applications. VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost, which can be used to assess the suboptimality of heuristic policies. However, solving ALPs near-optimally remains challenging, for example, when approximating MDPs with nonlinear cost functions and transition dynamics or when rich basis functions are required to obtain a good VFA. We address this tension between theory and solvability by proposing a convex saddle-point reformulation of an ALP that includes as primal and dual variables, respectively, a vector of basis function weights and a constraint violation density function over the state-action space. To solve this reformulation, we develop a proximal stochastic mirror descent (PSMD) method. We establish that PSMD returns a near-optimal ALP solution and a lower bound on the optimal policy cost in a finite number of iterations with high probability. We numerically compare PSMD with several benchmarks on inventory control and energy storage applications. We find that the PSMD lower bound is tighter than a perfect information bound. In contrast, the constraint sampling approach to solve ALPs may not provide a lower bound and applying row generation to tackle ALPs is not computationally viable. PSMD policies outperform problem-specific heuristics and are comparable to or better than the policies obtained using constraint sampling. Overall, our ALP reformulation and solution approach broaden the applicability of approximate linear programming.
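The lower-bound property mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the authors' PSMD method and does not solve an ALP; it only checks, for a small discounted cost-minimizing MDP, that a weight vector satisfying the standard ALP constraints yields a VFA that sits below the optimal cost-to-go. The MDP data, the two basis functions, the discount factor, and all function names are assumptions made up for illustration.

```python
# Toy discounted, cost-minimizing MDP. All numbers are hypothetical.
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
COST = [[2.0, 3.0], [1.0, 4.0], [5.0, 2.5]]  # COST[s][a]
# TRANS[s][a][t]: probability of moving from state s to state t under action a
TRANS = [
    [[0.8, 0.2, 0.0], [0.1, 0.6, 0.3]],
    [[0.5, 0.5, 0.0], [0.0, 0.3, 0.7]],
    [[0.2, 0.2, 0.6], [0.9, 0.0, 0.1]],
]


def basis(s):
    """Two assumed basis functions: a constant and the state index."""
    return [1.0, float(s)]


def vfa(beta, s):
    """Value function approximation: weighted sum of basis functions."""
    return sum(b * f for b, f in zip(beta, basis(s)))


def constraint_slack(beta, s, a):
    """Slack of the ALP constraint at (s, a):
    cost + discounted expected next-state VFA - current-state VFA."""
    expected_next = sum(TRANS[s][a][t] * vfa(beta, t) for t in range(N_STATES))
    return COST[s][a] + GAMMA * expected_next - vfa(beta, s)


def is_alp_feasible(beta, tol=1e-9):
    """True if every state-action constraint holds up to a small tolerance."""
    return all(
        constraint_slack(beta, s, a) >= -tol
        for s in range(N_STATES)
        for a in range(N_ACTIONS)
    )


def value_iteration(n_iter=2000):
    """Exact optimal cost-to-go, available here only because the MDP is tiny."""
    v = [0.0] * N_STATES
    for _ in range(n_iter):
        v = [
            min(
                COST[s][a]
                + GAMMA * sum(TRANS[s][a][t] * v[t] for t in range(N_STATES))
                for a in range(N_ACTIONS)
            )
            for s in range(N_STATES)
        ]
    return v


# A constant VFA equal to (smallest cost) / (1 - GAMMA) satisfies every constraint.
beta = [min(min(row) for row in COST) / (1.0 - GAMMA), 0.0]
v_star = value_iteration()

if __name__ == "__main__":
    print("feasible:", is_alp_feasible(beta))
    for s in range(N_STATES):
        print(s, "VFA:", vfa(beta, s), "optimal:", v_star[s])
```

In an intractable MDP the exact values from `value_iteration` are unavailable, which is why a guaranteed lower bound from a feasible VFA is useful for assessing heuristic policies.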

Keywords: approximate linear programming, approximate dynamic programming, Markov decision processes, first order methods, energy storage, inventory control

Suggested Citation

Lin, Qihang and Nadarajah, Selvaprabu and Soheili, Negar, Revisiting Approximate Linear Programming Using a Saddle Point Approach (June 11, 2017). Available at SSRN: https://ssrn.com/abstract=2984602 or http://dx.doi.org/10.2139/ssrn.2984602

Qihang Lin

University of Iowa - Henry B. Tippie College of Business

Acquisitions
21 East Market Street
Iowa City, IA 52242-1000
United States

Selvaprabu Nadarajah (Contact Author)

University of Illinois at Chicago - College of Business Administration

601 South Morgan Street
Chicago, IL 60607
United States

Negar Soheili

University of Illinois at Chicago - College of Business Administration

601 South Morgan Street
Chicago, IL 60607
United States
