Deep Reinforcement Learning for Online Assortment Customization: A Data-Driven Approach

45 Pages · Posted: 26 Jun 2024


Tao Li

Xi'an Jiaotong University (XJTU) - School of Management

Chenhao Wang

Chinese University of Hong Kong, Shenzhen

Yao Wang

Xi'an Jiaotong University (XJTU)

Shaojie Tang

University at Buffalo (SUNY) - School of Management

Ningyuan Chen

University of Toronto - Rotman School of Management

Date Written: June 19, 2024

Abstract

When a platform has limited inventory, it must offer each customer a suitable variety of products while managing the remaining stock. To maximize long-term revenue, the assortment policy needs to account for the complex purchasing behavior of customers whose arrival order and preferences may be unknown. We propose a data-driven approach to dynamic assortment planning that utilizes historical customer arrivals and transaction data. We formulate the online assortment customization problem as a Markov Decision Process (MDP) and, given the computational challenge of solving it exactly, employ a model-free Deep Reinforcement Learning (DRL) approach to learn the online assortment policy. Our method uses a specially designed deep neural network (DNN) to construct assortments that respect the inventory constraints, and an Advantage Actor-Critic (A2C) algorithm to update the DNN parameters with the help of a simulator built from the historical transaction data. To evaluate the effectiveness of our approach, we conduct simulations on both a synthetic data set, generated from a predetermined customer type distribution and a ground-truth choice model, and a real-world data set. Our extensive experiments demonstrate that our approach produces significantly higher long-term revenue than several existing methods and remains robust under various practical conditions. We also show that our approach can be easily adapted to a more general problem with reusable products, where customers may return purchased items after use; in this setting, our approach performs well under various usage time distributions.
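To make the described pipeline concrete, the following is a minimal sketch (in PyTorch, not the authors' implementation) of the ingredients named in the abstract: an actor-critic DNN, an inventory mask that restricts the offered assortment to in-stock products, and a one-step A2C update. The problem sizes, the assortment_size cap, and the treatment of the offered set as a joint action are illustrative assumptions; in the paper the reward and transitions would come from the simulator built on historical transaction data.

import torch
import torch.nn as nn
import torch.nn.functional as F

n_products, n_features, assortment_size = 20, 8, 4  # illustrative sizes (assumption)

class ActorCritic(nn.Module):
    """Shared body with an actor head (per-product scores) and a critic head (state value)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.actor = nn.Linear(64, n_products)
        self.critic = nn.Linear(64, 1)

    def forward(self, state):
        h = self.body(state)
        return self.actor(h), self.critic(h)

def masked_assortment(logits, inventory):
    """Exclude products with no remaining stock, then offer the top-k by score."""
    mask = inventory > 0
    masked_logits = logits.masked_fill(~mask, float('-inf'))
    probs = F.softmax(masked_logits, dim=-1)
    k = min(assortment_size, int(mask.sum()))
    offered = torch.topk(probs, k).indices
    return offered, probs

def a2c_loss(probs, offered, reward, value, next_value, gamma=0.99):
    """One-step advantage actor-critic loss; the offered set is scored as a joint action (simplifying assumption)."""
    advantage = reward + gamma * next_value.detach() - value
    log_prob = torch.log(probs[offered]).sum()
    actor_loss = -log_prob * advantage.detach()
    critic_loss = advantage.pow(2)
    return (actor_loss + 0.5 * critic_loss).squeeze()

# One simulated interaction; random placeholders stand in for the transaction-data simulator.
net = ActorCritic()
state = torch.randn(n_features)               # e.g., encoded inventory + customer context
inventory = torch.randint(0, 3, (n_products,))
logits, value = net(state)
offered, probs = masked_assortment(logits, inventory)
reward = torch.tensor(1.0)                    # revenue the simulator would return
_, next_value = net(torch.randn(n_features))
loss = a2c_loss(probs, offered, reward, value, next_value)
loss.backward()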

Keywords: online assortment, customization, deep reinforcement learning, simulation, reusable products

Suggested Citation

Li, Tao and Wang, Chenhao and Wang, Yao and Tang, Shaojie and Chen, Ningyuan, Deep Reinforcement Learning for Online Assortment Customization: A Data-Driven Approach (June 19, 2024). Available at SSRN: https://ssrn.com/abstract=4870298 or http://dx.doi.org/10.2139/ssrn.4870298

Tao Li (Contact Author)

Xi'an Jiaotong University (XJTU) - School of Management

Xi'an, Shaanxi 710049
China

Chenhao Wang

Chinese University of Hong Kong, Shenzhen

2001 Longxiang Boulevard, Longgang District
Shenzhen, 518172
China

Yao Wang

Xi'an Jiaotong University (XJTU)

26 Xianning W Rd.
Xi'an Jiao Tong University
Xi'an, Shaanxi 710049
China

Shaojie Tang

University at Buffalo (SUNY) - School of Management

255 Jacobs Management Center
Buffalo, NY 14260
United States

Ningyuan Chen

University of Toronto - Rotman School of Management

