Temporal Fairness in Learning and Earning: Price Protection Guarantee and Phase Transitions

51 Pages. Posted: 15 Nov 2022. Last revised: 22 May 2023.

Qing Feng

Cornell University - Operations Research and Industrial Engineering

Ruihao Zhu

Cornell University

Stefanus Jasin

University of Michigan, Stephen M. Ross School of Business

Date Written: November 2, 2022

Abstract

Motivated by the prevalence of the ``price protection guarantee'', which helps to promote temporal fairness in dynamic pricing, we study the impact of such a policy on the design of online learning algorithms for data-driven dynamic pricing with initially unknown customer demand. Under the price protection guarantee, a customer who purchased a product in the past can receive a refund from the seller if the seller lowers the price during the so-called price protection period (typically defined as a certain time window after the purchase date). We consider a setting where a firm sells a product over a horizon of $T$ time steps. For this setting, we characterize how the value of $M$, the length of the price protection period, affects the optimal regret of the learning process. We show that the optimal regret is $\tilde{\Theta}(\sqrt{T}+\min\{M,\,T^{2/3}\})$ by first establishing a fundamental impossible regime with a novel \emph{refund-aware} regret lower bound analysis. Then, we propose LEAP, a phased-exploration algorithm for \underline{L}earning and \underline{EA}rning under \underline{P}rice Protection, which matches this lower bound up to logarithmic factors, or even doubly logarithmic factors when only two prices are available to the seller. Our results reveal surprising phase transitions of the optimal regret with respect to $M$. Specifically, when $M$ is not too large, the optimal regret does not differ materially from that of the classic setting with no price protection guarantee. We also show that there is an upper limit on how much the optimal regret can deteriorate as $M$ grows large. Finally, we conduct extensive numerical experiments to show the benefit of LEAP over other heuristic methods for this problem.
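
As an illustration of the phase transitions implied by the rate $\tilde{\Theta}(\sqrt{T}+\min\{M,\,T^{2/3}\})$, the following is a minimal sketch that evaluates the rate and reports which term dominates. It drops all constants and logarithmic factors, so the numbers indicate orders of magnitude only. The function names `regret_rate` and `regime` and the sample values of $T$ and $M$ are hypothetical and do not come from the paper.

```python
import math


def regret_rate(T: int, M: float) -> float:
    """Order of the optimal regret, sqrt(T) + min(M, T^(2/3)).

    Assumption: constants and logarithmic factors are ignored.
    """
    return math.sqrt(T) + min(M, T ** (2 / 3))


def regime(T: int, M: float) -> str:
    """Name the term that dominates the rate for a given (T, M)."""
    if M <= math.sqrt(T):
        # Short protection period: same order as with no price protection.
        return "sqrt(T) dominates"
    if M < T ** (2 / 3):
        # Intermediate protection period: the rate grows linearly in M.
        return "M dominates"
    # Long protection period: the rate is capped at T^(2/3).
    return "T^(2/3) dominates"


if __name__ == "__main__":
    T = 1_000_000  # illustrative horizon, not taken from the paper
    for M in (10, 5_000, 500_000):
        print(f"M={M}: {regime(T, M)}, rate ~ {regret_rate(T, M):.0f}")
```

Under these assumptions the transitions sit at roughly $M \approx \sqrt{T}$ and $M \approx T^{2/3}$: below the first threshold the rate matches the $\sqrt{T}$ order of the classic setting, and above the second it stops growing with $M$.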

Keywords: dynamic pricing, online learning, price protection, exploration-exploitation tradeoff

Suggested Citation

Feng, Qing and Zhu, Ruihao and Jasin, Stefanus, Temporal Fairness in Learning and Earning: Price Protection Guarantee and Phase Transitions (November 2, 2022). Available at SSRN: https://ssrn.com/abstract=4265182 or http://dx.doi.org/10.2139/ssrn.4265182

Qing Feng

Cornell University - Operations Research and Industrial Engineering

Ithaca, NY 14853
United States
6073199516 (Phone)

Ruihao Zhu (Contact Author)

Cornell University

Ithaca, NY 14853
United States

Stefanus Jasin

University of Michigan, Stephen M. Ross School of Business

701 Tappan Street
Ann Arbor, MI 48109
United States
