Enhancing Online Food Delivery with Transfer Points: A Decompose-Then-Optimize Approach via Hierarchical Reinforcement Learning
48 Pages. Posted: 5 May 2025. Last revised: 24 May 2025
Date Written: April 01, 2025
Abstract
Online food delivery services can reduce operational costs and improve efficiency by consolidating orders with similar origins, destinations, and time windows at intermediate transfer locations. This research investigates the complexity inherent in the online food delivery problem with transfer (OFDP-T) and assesses how optimized routes and courier assignments involving transfer locations can enhance system delivery performance. We propose a novel learning-based decompose-then-optimize framework that manages the exponentially growing problem size introduced by transfer and synchronization decisions and enables adaptive decision-making under uncertainty. The decomposition is enabled by tightly integrating a first-step hierarchical reinforcement learning (HRL) model with a second-step model that can be solved as a linear assignment problem (LAP). Comprehensive experiments based on real-world food delivery data demonstrate that combining task-agnostic reward design with LAP-guided policy search significantly improves on the baseline methods. In our case study, this combination improves baseline performance by 27.2%, with the reward shaping alone boosting HRL by 37.5% and LAP-guided search adding 6.9%. Notably, even limited use of transfers can yield an improvement of over 46.6% in route efficiency and a 23.2% gain for remaining orders. The framework offers a deployable, real-time solution and actionable strategies for coordinating complex delivery operations and improving fleet utilization, supporting more sustainable and scalable food delivery systems.
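To make the second-step model concrete, the following is a minimal sketch of a linear assignment problem of the kind the abstract refers to. It is not the authors' formulation: the cost matrix, the interpretation of rows as couriers and columns as order bundles, and the function name `solve_lap` are all assumptions introduced for illustration. The solver enumerates every one-to-one assignment, which is only practical for tiny instances; a real system would more plausibly use a polynomial-time method such as the Hungarian algorithm (for example, SciPy's `scipy.optimize.linear_sum_assignment`).

```python
from itertools import permutations


def solve_lap(cost):
    """Brute-force a square linear assignment problem.

    cost[i][j] is the (hypothetical) cost of giving courier i bundle j.
    Returns (assignment, total), where assignment[i] is the bundle
    index chosen for courier i and total is the minimum summed cost.
    """
    n = len(cost)
    best_total = float("inf")
    best_perm = None
    # Each permutation is one complete one-to-one assignment.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return list(best_perm), best_total


# Hypothetical 3x3 cost matrix (e.g. estimated delivery delay in minutes).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]

assignment, total = solve_lap(cost)
print(assignment, total)  # → [1, 0, 2] 5
```

In a decompose-then-optimize setting, the costs fed into such an assignment step would come from the upstream learned model; here they are fixed numbers chosen only so the optimum is easy to verify by hand.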
Keywords: Online food delivery problem, hierarchical reinforcement learning, agnostic reward design
Suggested Citation
Zhang, Xinyuan and Luo, Qi and Qian, Xinwu, Enhancing Online Food Delivery with Transfer Points: A Decompose-Then-Optimize Approach via Hierarchical Reinforcement Learning (April 01, 2025). Available at SSRN: https://ssrn.com/abstract=5201175