Online Learning and Pricing for Multiple Products with Reference Price Effects

1 Pages Posted: 8 Feb 2023 Last revised: 18 Jan 2024


Sheng Ji

School of Management, Zhejiang University

Cong Shi

University of Miami - Department of Management

Yi Yang

School of Management, Zhejiang University

Date Written: February 6, 2023

Abstract

We consider the dynamic pricing problem of a monopolist seller who sells a set of mutually substitutable products over a finite time horizon. Customer demand is sensitive to the price of each individual product and to a reference price, which is formed from a comparison among the prices of all products. To maximize the total expected profit, the seller needs to determine the selling price of each product and also select a reference product (to be displayed) that affects the consumer's reference price. However, the seller initially knows neither the demand function nor the customer's reference price, but can learn them on the fly from past observations. As such, the seller faces the classical trade-off between exploration (learning the demand function and reference price) and exploitation (using what has been learned thus far to maximize revenue). We propose a rate-optimal dynamic learning-and-pricing algorithm that integrates iterative least squares estimation and bandit control techniques. We show that the cumulative regret, i.e., the expected revenue loss caused by not using the optimal policy over $T$ periods, is upper bounded by $\tilde{O}(n^2 \sqrt{T})$, where $\tilde{O}(\cdot)$ hides logarithmic factors. We also establish a regret lower bound of $\Omega(n^{2}\sqrt{T})$ for any learning policy. We then extend our analysis to a more general demand model. In numerical experiments, our algorithm performs consistently well and outperforms an exploration-exploitation benchmark. The price experimentation and estimation techniques used here could be readily applied in real retail management.
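To make the exploration-exploitation idea in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a hypothetical linear demand model, d_i = a_i - b_i p_i + c_i (p_ref - p_i) + noise, in which the price of the displayed reference product acts as the reference price. The forced-exploration schedule (random prices at perfect-square periods), the random-candidate search over prices, and all names and parameter values (`simulate_demand`, `fit_least_squares`, `best_action`, `run`) are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def simulate_demand(prices, ref, theta, rng, noise_sd=0.1):
    """Hypothetical demand: d_i = a_i - b_i*p_i + c_i*(p_ref - p_i) + noise."""
    a, b, c = theta
    mean = a - b * prices + c * (prices[ref] - prices)
    return mean + rng.normal(0.0, noise_sd, size=prices.shape)

def fit_least_squares(history, n):
    """Per-product least squares on the features [1, p_i, p_ref - p_i]."""
    est = np.zeros((n, 3))
    for i in range(n):
        X = np.array([[1.0, p[i], p[r] - p[i]] for p, r, _ in history])
        y = np.array([d[i] for _, _, d in history])
        est[i], *_ = np.linalg.lstsq(X, y, rcond=None)
    return est

def best_action(est, n, lo, hi, rng, n_candidates=50):
    """Crude exploitation step: score random price vectors for every
    candidate reference product and keep the best estimated revenue."""
    best, best_rev = None, -np.inf
    for _ in range(n_candidates):
        p = rng.uniform(lo, hi, size=n)
        for r in range(n):
            d_hat = est[:, 0] + est[:, 1] * p + est[:, 2] * (p[r] - p)
            rev = float(p @ d_hat)
            if rev > best_rev:
                best, best_rev = (p, r), rev
    return best

def run(T=300, n=2, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed ground truth (a, b, c); unknown to the seller in the sketch.
    theta = (np.full(n, 10.0), np.full(n, 1.0), np.full(n, 0.5))
    lo, hi = 1.0, 6.0  # assumed feasible price range
    history, revenue, est = [], 0.0, None
    for t in range(1, T + 1):
        # Explore until a first estimate exists, then at perfect-square
        # periods, so roughly sqrt(T) periods are spent on experimentation.
        explore = est is None or int(np.sqrt(t)) ** 2 == t
        if explore:
            prices, ref = rng.uniform(lo, hi, size=n), int(rng.integers(n))
        else:
            prices, ref = best_action(est, n, lo, hi, rng)
        demand = simulate_demand(prices, ref, theta, rng)
        revenue += float(prices @ demand)
        history.append((prices, ref, demand))
        if len(history) >= 10:
            est = fit_least_squares(history, n)  # re-estimate every period
    return est, revenue

if __name__ == "__main__":
    est, revenue = run()
    print("estimated [a, -b, c] per product:\n", est)
    print("total revenue:", revenue)
```

Each row of `est` holds the fitted intercept, own-price coefficient, and reference-gap coefficient for one product. Under the assumed model these correspond to a_i, -b_i, and c_i, so comparing the rows with the assumed ground truth gives a quick sanity check of the estimation step.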

Keywords: online learning, pricing, reference price effect, multiple products, revenue management, multi-armed bandit

Suggested Citation

Ji, Sheng and Shi, Cong and Yang, Yi, Online Learning and Pricing for Multiple Products with Reference Price Effects (February 6, 2023). Available at SSRN: https://ssrn.com/abstract=4349904 or http://dx.doi.org/10.2139/ssrn.4349904

Sheng Ji

School of Management, Zhejiang University

Hangzhou, Zhejiang Province 310058
China

Cong Shi (Contact Author)

University of Miami - Department of Management

United States

HOME PAGE: https://congshi-research.github.io/

Yi Yang

School of Management, Zhejiang University

38 Zheda Road
Hangzhou, Zhejiang 310058
China

