The University of Hong Kong
Pokfulam Road, Pokfulam
Hong Kong, China
Reinforcement learning, logistics, supply chain, Markov decision process, Q-learning, actor-critic methods, neural network
Order cancellations, Markov decision process, proximal policy optimization, food delivery problem