Price Dynamics on Amazon Marketplace: A Multivariate Random Forest Variable Selection Approach
42 Pages. Posted: 5 Feb 2020. Last revised: 24 Feb 2022
Date Written: January 13, 2022
Abstract
On Amazon Marketplace, Amazon is both a seller and a platform host to third-party (3P) sellers. The business press and regulators are increasingly scrutinizing Amazon’s power in its Marketplace. In particular, Amazon has access to the sales and prices of every seller, whereas 3P sellers observe only prices and their own sales. In addition, Amazon purportedly uses complex and opaque algorithmic pricing. Understanding these algorithms, and developing their own in response, might be beyond the expertise and budget of 3P sellers, especially without data on rival sales. We develop a forecasting tool for price changes on Amazon, based solely on publicly scrapable data. We first develop a variable selection algorithm using multivariate random forests (MVRF) to identify key predictors of a multivariate outcome. We demonstrate its robustness in recovering key variables in simulated data. We use this variable selection algorithm to identify key predictors of price changes by Amazon and 3P sellers in five product categories. We then use these selected variables in a generalized additive regression model and demonstrate stronger forecasting ability than extant random forests, XGBoost, and LASSO. While all these methods work for high-dimensional data, our simulations show that the MVRF variable selection algorithm outperforms the others when the outcome data are sparser. The variables selected by our algorithm reveal Marketplace patterns that might be of interest to manufacturers, 3P sellers, and regulators.
Keywords: forecasting price dynamics, 3P sellers, Amazon marketplace, variable selection, multivariate random forests, machine learning.
JEL Classification: C14, C32, C38, C53, C63
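Illustrative sketch: The abstract describes selecting key predictors of a multivariate outcome with random forests and checking recovery on simulated data. The Python sketch below shows the general idea only and is not the paper's MVRF algorithm. It assumes a toy forest whose trees use median splits chosen to minimize squared error pooled across all outcome components, and it ranks variables by permutation importance on held-out data. The simulated data, the function names (grow, predict, permutation_importance), and all tuning values are hypothetical choices made for illustration.

```python
import random

def mean_vec(rows):
    """Component-wise mean of a list of outcome vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sse(rows):
    """Squared error around the mean vector, summed over all outcome components."""
    m = mean_vec(rows)
    return sum((v - mj) ** 2 for r in rows for v, mj in zip(r, m))

def grow(X, Y, idx, depth, n_try, rng):
    """Grow one tree; each split minimizes the SSE pooled across outcomes."""
    best = None
    if depth > 0 and len(idx) >= 10:
        for f in rng.sample(range(len(X[0])), n_try):  # random feature subset
            thr = sorted(X[i][f] for i in idx)[len(idx) // 2]  # median split
            left = [i for i in idx if X[i][f] < thr]
            right = [i for i in idx if X[i][f] >= thr]
            if left and right:
                cost = sse([Y[i] for i in left]) + sse([Y[i] for i in right])
                if best is None or cost < best[0]:
                    best = (cost, f, thr, left, right)
    if best is None:
        return mean_vec([Y[i] for i in idx])  # leaf: mean outcome vector
    _, f, thr, left, right = best
    return (f, thr, grow(X, Y, left, depth - 1, n_try, rng),
            grow(X, Y, right, depth - 1, n_try, rng))

def predict(forest, x):
    """Average the leaf vectors reached by x in every tree."""
    preds = []
    for node in forest:
        while isinstance(node, tuple):  # internal nodes are tuples, leaves are lists
            f, thr, lo, hi = node
            node = lo if x[f] < thr else hi
        preds.append(node)
    return mean_vec(preds)

def mse(forest, X, Y):
    """Mean over observations of squared error summed across outcome components."""
    return sum((p - y) ** 2 for x, yv in zip(X, Y)
               for p, y in zip(predict(forest, x), yv)) / len(X)

def permutation_importance(forest, X, Y, rng):
    """Increase in held-out error after shuffling each predictor in turn."""
    base = mse(forest, X, Y)
    scores = []
    for f in range(len(X[0])):
        col = [x[f] for x in X]
        rng.shuffle(col)
        Xp = [x[:f] + [c] + x[f + 1:] for x, c in zip(X, col)]
        scores.append(mse(forest, Xp, Y) - base)
    return scores

# Hypothetical simulated data: 6 candidate predictors, 2 outcomes.
# Only predictors 0 and 1 drive the outcomes; the rest are noise.
rng = random.Random(0)
n, p = 300, 6
X = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
Y = [[2 * x[0] + rng.gauss(0, 0.1), -2 * x[1] + rng.gauss(0, 0.1)] for x in X]
X_train, Y_train, X_test, Y_test = X[:200], Y[:200], X[200:], Y[200:]

forest = []
for _ in range(30):
    boot = [rng.randrange(200) for _ in range(200)]  # bootstrap sample of rows
    forest.append(grow(X_train, Y_train, boot, 4, 3, rng))

scores = permutation_importance(forest, X_test, Y_test, rng)
selected = sorted(sorted(range(p), key=lambda f: -scores[f])[:2])
print("importance scores:", [round(s, 3) for s in scores])
print("selected predictors:", selected)
```

In this toy setup, the two predictors that generate the outcomes should receive much larger importance scores than the noise predictors. A forecasting model, such as the generalized additive regression mentioned in the abstract, would then be fit on the selected variables only.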