Online Learning for Pricing in On-Demand Vehicle Sharing Networks

We consider the pricing problem in on-demand vehicle sharing networks with online demand learning. Our model builds upon Benjaafar et al. (2022), where the service provider applies a static pricing policy while maintaining a balanced network. When no prior information on the demand functions is available, the main challenge in designing an online learning algorithm is how to explore the demand functions while keeping the network balanced. We address this challenge with an online learning algorithm adapted from the ellipsoid method. The search subroutine of our algorithm combines bisection with the Upper Confidence Bound (UCB) idea: for each type of trip, as characterized by its origin and destination, it locates the price associated with a desired demand level and estimates the gradient at that price. By carefully selecting the center of the ellipsoid in each iteration, we ensure that the expected revenue improves and that the network remains balanced. We prove that the regret of our learning algorithm is bounded by $\tilde O(\sqrt{T})$ for a fixed workload parameter $\Delta$. The numerical performance of the algorithm is illustrated using synthetic data. We also discuss extensions of the online learning algorithm to the case in which the workload parameter $\Delta$ is unknown.
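To make the idea of a bisection search with confidence bounds concrete, the following is a minimal sketch for a single trip type. It is not the paper's subroutine: it assumes that mean demand is decreasing in price, that demand observations at a fixed price lie in an interval of known width `noise_range`, and it uses a Hoeffding-style confidence radius with a union bound over sample counts. The function name `find_price_for_demand` and all parameters are hypothetical, and gradient estimation and the ellipsoid update are omitted.

```python
import math


def find_price_for_demand(sample_demand, target, lo, hi,
                          tol=0.01, delta=0.05, noise_range=1.0,
                          max_samples=20000):
    """Noisy bisection for a price whose mean demand is close to `target`.

    Illustrative assumptions (not taken from the paper):
      * mean demand is decreasing in price on [lo, hi];
      * observations at a fixed price lie in an interval of width `noise_range`.
    `sample_demand(price)` returns one noisy demand observation.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        total, n = 0.0, 0
        while n < max_samples:
            total += sample_demand(mid)
            n += 1
            mean = total / n
            # Hoeffding radius at confidence delta / (n * (n + 1)), so the
            # bounds hold simultaneously for all n with probability >= 1 - delta.
            radius = noise_range * math.sqrt(
                math.log(2.0 * n * (n + 1) / delta) / (2.0 * n))
            if mean - radius > target:
                lo = mid  # demand is confidently too high: raise the price
                break
            if mean + radius < target:
                hi = mid  # demand is confidently too low: lower the price
                break
        else:
            # The sample budget ran out without separating mean demand from
            # the target, so the demand at `mid` is already close to it.
            return mid
    return 0.5 * (lo + hi)
```

With a simulated demand such as `1 - 0.5 * price` plus small bounded noise, calling `find_price_for_demand(simulator, target=0.3, lo=0.0, hi=2.0, noise_range=0.2)` should return a price near 1.4. Sampling at each midpoint only until the confidence interval excludes the target keeps exploration cheap at prices far from the target and spends more samples only near it.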

Keywords: vehicle sharing network, pricing, online learning, regret analysis

Suggested Citation:

Benjaafar, Saif and Gao, Xiangyu and Shen, Xiaobing and Zhang, Huanan, Online Learning for Pricing in On-Demand Vehicle Sharing Networks (February 1, 2023). Available at SSRN: https://ssrn.com/abstract=4344364 or http://dx.doi.org/10.2139/ssrn.4344364