Collaborative Learning and Decision-Making on Pricing and Recommendation: A Simple Framework for Planning
42 Pages. Posted: 22 Nov 2022. Last revised: 30 Jan 2023.
Date Written: November 1, 2022
We formulate a collaborative learning and decision-making problem involving contextual information. In current business practice, pricing and recommendation decisions are often made jointly by multiple teams in sequence. The decision-making processes of the different teams can be controlled by either a centralized or a decentralized planner. We propose a simple collaboration framework that integrates learning with decision-making in an unknown environment. The main challenge in the decentralized framework is that each team does not know the decision-making processes of the other teams, yet the sequential decisions are mutually dependent. Motivated by practical concerns about high exploration cost and implementation complexity, we propose a simple greedy algorithm for the centralized planner and a "greedy" + "weighted sampling" (GWS) algorithm for both the centralized and decentralized planners to balance learning and earning. Surprisingly, we show that the exploration-free greedy algorithm can achieve the optimal rate when context diversity holds. The GWS algorithm works effectively for either type of planner under a much weaker condition, which we call context variation. Furthermore, we extend our framework to the multi-product pricing and ranking problem and study the issue of model misspecification. We test our algorithms using real data from JD.com, a large e-commerce retailer. Numerical studies validate the superior performance of the two proposed frameworks for different types of planners.
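To make the idea of an exploration-free greedy policy concrete, here is a minimal sketch in Python. It is not the algorithm from the paper. It assumes a generic linear contextual bandit with a few arms, per-arm ridge regression, and Gaussian noise. The function name greedy_linear_bandit and all of its parameters are hypothetical and chosen only for illustration. In each round the policy picks the arm with the highest estimated reward and never adds an exploration bonus, so any learning comes only from the variety in the observed contexts.

```python
import numpy as np


def greedy_linear_bandit(contexts, true_theta, noise_std=0.1, ridge=1.0, seed=0):
    """Exploration-free greedy policy for a linear contextual bandit (illustrative).

    contexts:   (T, d) array, one context vector per round
    true_theta: (K, d) array, assumed true parameter vector of each arm
    Returns the per-round regret, shape (T,), and the final estimates, shape (K, d).
    """
    rng = np.random.default_rng(seed)
    T, d = contexts.shape
    K = true_theta.shape[0]
    # Per-arm ridge statistics: A_k = ridge * I + sum(x x^T), b_k = sum(reward * x)
    A = np.stack([ridge * np.eye(d) for _ in range(K)])
    b = np.zeros((K, d))
    regret = np.zeros(T)
    for t in range(T):
        x = contexts[t]
        # Current ridge estimate of each arm's parameter vector
        theta_hat = np.stack([np.linalg.solve(A[k], b[k]) for k in range(K)])
        # Greedy choice: highest estimated reward, no exploration term
        arm = int(np.argmax(theta_hat @ x))
        expected = true_theta @ x
        reward = expected[arm] + noise_std * rng.standard_normal()
        # Update only the statistics of the arm that was played
        A[arm] += np.outer(x, x)
        b[arm] += reward * x
        regret[t] = expected.max() - expected[arm]
    theta_hat = np.stack([np.linalg.solve(A[k], b[k]) for k in range(K)])
    return regret, theta_hat


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    demo_contexts = demo_rng.standard_normal((500, 3))  # diverse synthetic contexts
    demo_theta = demo_rng.standard_normal((2, 3))       # two hypothetical arms
    demo_regret, _ = greedy_linear_bandit(demo_contexts, demo_theta)
    print("cumulative regret:", demo_regret.sum())
```

The synthetic contexts above are drawn from a standard normal distribution, so they vary in every direction. With contexts that barely vary, a purely greedy policy of this kind can stop learning about an arm it rarely plays. That is the kind of setting in which one would expect some added sampling, such as the weighted sampling step the abstract mentions, to matter.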
Keywords: data-driven decision-making; contextual bandit; centralized and decentralized planners; greedy algorithm; collaborative decision-making