Beyond IID: data-driven decision making in heterogeneous environments

86 Pages. Posted: 27 Jun 2022. Last revised: 1 Jan 2025.

Omar Besbes

Columbia University - Columbia Business School, Decision Risk and Operations

Will Ma

Columbia University - Columbia Business School, Decision Risk and Operations

Omar Mouchtaki

New York University (NYU) - Leonard N. Stern School of Business

Date Written: May 26, 2022

Abstract

How should one leverage historical data when past observations are not perfectly indicative of the future, e.g., due to the presence of unobserved confounders that one cannot "correct" for? Motivated by this question, we study a data-driven decision-making framework in which historical samples are generated from unknown and different distributions, assumed to lie in a heterogeneity ball with known radius centered around the (also unknown) future (out-of-sample) distribution on which the performance of a decision will be evaluated. This work aims to analyze the performance of central data-driven policies, as well as near-optimal ones, in these heterogeneous environments, and to understand the key drivers of performance. We first establish a result that allows one to upper-bound the asymptotic worst-case regret of a broad class of policies. Leveraging this result, for any integral probability metric, we provide a general analysis of the performance achieved by Sample Average Approximation (SAA) as a function of the radius of the heterogeneity ball. This analysis is centered around the approximation parameter, a notion of complexity we introduce to capture how the interplay between the heterogeneity and the problem structure impacts the performance of SAA. In turn, we illustrate through several widely studied problems (e.g., newsvendor, pricing) how this methodology can be applied, and find that the performance of SAA varies considerably depending on the combination of problem class and heterogeneity. The failure of SAA for certain instances motivates the design of alternative policies to achieve rate-optimality. We derive problem-dependent policies achieving strong guarantees for the illustrative problems described above and provide initial results towards a principled approach for the design and analysis of general rate-optimal algorithms.
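
To make the SAA policy discussed in the abstract concrete, the following is a minimal sketch for the newsvendor problem. It is an illustration only and is not taken from the paper. The function names, the cost parameters `underage` and `overage`, and the data model in `heterogeneous_samples` (Uniform[0, 1] demand with each draw shifted by at most `radius`) are all assumptions made here. A location shift of at most `radius` keeps each historical distribution within Wasserstein distance `radius` of the future one, and the Wasserstein distance is one example of an integral probability metric.

```python
import math
import random


def newsvendor_cost(order, demand, underage, overage):
    """Cost of ordering `order` units when realized demand is `demand`."""
    return underage * max(demand - order, 0.0) + overage * max(order - demand, 0.0)


def empirical_cost(order, samples, underage, overage):
    """Average newsvendor cost of `order` over the historical samples."""
    return sum(newsvendor_cost(order, d, underage, overage) for d in samples) / len(samples)


def saa_newsvendor(samples, underage, overage):
    """SAA decision: smallest sample whose empirical CDF reaches the critical ratio.

    This empirical quantile minimizes `empirical_cost` over all order quantities.
    """
    critical_ratio = underage / (underage + overage)
    ordered = sorted(samples)
    # The small tolerance guards against floating-point round-up before ceil.
    index = max(math.ceil(critical_ratio * len(ordered) - 1e-12) - 1, 0)
    return ordered[index]


def heterogeneous_samples(n, radius, seed=0):
    """Hypothetical data model (an assumption, not the paper's):
    Uniform[0, 1] demand, each draw shifted by an amount in [-radius, radius]."""
    rng = random.Random(seed)
    return [rng.random() + rng.uniform(-radius, radius) for _ in range(n)]


if __name__ == "__main__":
    history = heterogeneous_samples(n=1000, radius=0.1)
    order = saa_newsvendor(history, underage=3.0, overage=1.0)
    print(f"SAA order quantity: {order:.3f}")
```

SAA treats the pooled history as if it were drawn from the future distribution. The sketch shows only that policy, not the regret analysis or the alternative rate-optimal policies the abstract refers to.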

Keywords: data-driven algorithms, distribution shift, distributionally robust optimization, minimax regret, sample average approximation, pricing, newsvendor, ski-rental

JEL Classification: C02, C44, C61

Suggested Citation

Besbes, Omar and Ma, Will and Mouchtaki, Omar, Beyond IID: data-driven decision making in heterogeneous environments (May 26, 2022). Columbia Business School Research Paper No. 4140928, Available at SSRN: https://ssrn.com/abstract=4140928 or http://dx.doi.org/10.2139/ssrn.4140928

Omar Besbes

Columbia University - Columbia Business School, Decision Risk and Operations

New York, NY
United States

Will Ma

Columbia University - Columbia Business School, Decision Risk and Operations

New York, NY
United States

Omar Mouchtaki (Contact Author)

New York University (NYU) - Leonard N. Stern School of Business

44 West 4th Street
New York, NY 10012
United States
