Dynamic Pricing in an Evolving and Unknown Marketplace

60 pages. Posted: 6 Jun 2019. Last revised: 6 Feb 2020.

Yiwei Chen

University of Cincinnati - Lindner College of Business

Zheng Wen

Adobe Research

Yao Xie

Georgia Institute of Technology

Date Written: May 5, 2019

Abstract

We consider a firm that sells a single type of product in multiple local markets over a finite horizon via dynamically adjusted prices. To prevent price discrimination, the firm posts the same price in all local markets at any given time. The horizon contains one or more change-points. Between any two consecutive change-points, each local market's demand function evolves linearly over time. Each change-point is classified as either zero-order or first-order according to how smoothly the demand function changes there. At a zero-order change-point, at least one local market's demand function changes abruptly. At a first-order change-point, all local markets' demand functions evolve continuously over time, but the speed of demand evolution in at least one local market changes abruptly. Before the horizon starts, the firm has no information about any parameter that governs the demand evolution process. The firm seeks a pricing policy that yields as much revenue as possible. We show that the regret under any pricing policy is lower bounded by CT^{1/2} for some constant C>0, and that the lower bound worsens to CT^{2/3} if at least one change-point is a first-order change-point.
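
The two kinds of change-point can be written out in symbols. The sketch below is an illustration only: the notation (d_i for the demand of local market i, p for the posted price, \tau_k for the k-th change-point, a_{i,k} and b_{i,k} for the level and evolution speed on the k-th segment) is assumed here and is not taken from the paper.

```latex
% Assumed notation, for illustration only (not the paper's own).
% Demand in local market i at price p evolves linearly in time
% between consecutive change-points \tau_k and \tau_{k+1}:
\[
  d_i(p, t) = a_{i,k}(p) + b_{i,k}(p)\,(t - \tau_k),
  \qquad \tau_k \le t < \tau_{k+1}.
\]
% Zero-order change-point at \tau_{k+1}: the demand level jumps
% in at least one market i (for some price p):
\[
  a_{i,k+1}(p) \neq a_{i,k}(p) + b_{i,k}(p)\,(\tau_{k+1} - \tau_k).
\]
% First-order change-point at \tau_{k+1}: demand is continuous in every
% market, but the evolution speed changes in at least one market i:
\[
  a_{i,k+1}(p) = a_{i,k}(p) + b_{i,k}(p)\,(\tau_{k+1} - \tau_k)
  \ \text{for all } i,
  \qquad
  b_{i,k+1}(p) \neq b_{i,k}(p) \ \text{for some } i.
\]
```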

We propose a Joint Change-Point Detection and Time-adjusted Upper Confidence Bound (CU) algorithm. The algorithm has two components: a change-point detection component and an exploration-exploitation component. In the change-point detection component, the firm uniformly samples each price once in each batch, where all batches are time intervals of the same length. The firm uses the sales data collected at these sampling times both to detect whether a change has occurred and, if so, to determine whether it is a zero-order or a first-order change. In the exploration-exploitation component, the firm runs a time-adjusted upper confidence bound (UCB) algorithm between two consecutive detected change-points. Because demand evolves between two consecutive change-points, we introduce a time factor into the classical UCB algorithm to correct the bias that arises when historical sales data are used to estimate current demand. We show that the CU algorithm attains the regret lower bounds up to logarithmic factors.
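
To make the idea of a time factor concrete, here is a minimal Python sketch of one way a UCB index could be adjusted for demand that drifts linearly in time. It is not the paper's CU algorithm: the least-squares trend fit, the form of the confidence bonus, and the names `fit_linear_trend`, `time_adjusted_index`, `choose_price`, and `bonus_scale` are all assumptions made for illustration. The sketch covers a single segment between two change-points and does not include change-point detection.

```python
import math


def fit_linear_trend(times, sales):
    """Least-squares fit of sales ~ intercept + slope * time (illustrative)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(sales) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0.0:
        # No spread in the sampling times, so a slope cannot be estimated.
        return mean_s, 0.0
    cov = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, sales))
    slope = cov / var_t
    intercept = mean_s - slope * mean_t
    return intercept, slope


def time_adjusted_index(price, times, sales, now, bonus_scale=1.0):
    """Optimistic revenue index for one price at time `now`.

    Past sales are extrapolated to the present with the fitted trend,
    instead of being averaged as in classical UCB. The bonus term is a
    hypothetical choice, not the one analyzed in the paper.
    """
    if not times:
        return math.inf  # force every price to be sampled at least once
    intercept, slope = fit_linear_trend(times, sales)
    predicted_demand = intercept + slope * now
    bonus = bonus_scale * math.sqrt(2.0 * math.log(max(now, 2)) / len(times))
    return price * (predicted_demand + bonus)


def choose_price(prices, history, now):
    """Pick the price with the largest index; history[p] = (times, sales)."""
    return max(prices, key=lambda p: time_adjusted_index(p, *history[p], now))
```

With sales of 10, 8, and 6 observed at times 1, 2, and 3, a plain average would estimate demand at 8, while the trend fit extrapolates to 4 at time 4. That gap is the kind of bias the time factor is meant to remove.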

Keywords: revenue management, dynamic pricing, online learning, multi-armed bandit, change-point detection, asymptotic optimality

Suggested Citation

Chen, Yiwei and Wen, Zheng and Xie, Yao, Dynamic Pricing in an Evolving and Unknown Marketplace (May 5, 2019). Available at SSRN: https://ssrn.com/abstract=3382957 or http://dx.doi.org/10.2139/ssrn.3382957

Yiwei Chen (Contact Author)

University of Cincinnati - Lindner College of Business

P.O. Box 210195
Cincinnati, OH 45221-0195
United States

Zheng Wen

Adobe Research

321 Park Avenue
San Jose, CA 95113

Yao Xie

Georgia Institute of Technology

Atlanta, GA 30332
United States
