Pokfulam Road
Hong Kong, Pokfulam HK
China
The University of Hong Kong
Reinforcement learning, logistics, supply chain, Markov decision process, Q-learning, actor-critic methods, neural networks