An Online Reinforcement Learning Approach to Charging and Order-Dispatching Optimization for An E-hailing Electric Vehicle Fleet

30 Pages Posted: 30 Jun 2022

Pengyu Yan

University of Electronic Science and Technology of China (UESTC)

Kaize Yu

University of Electronic Science and Technology of China (UESTC)

Xiuli Chao

University of Michigan at Ann Arbor - Department of Industrial and Operations Engineering

Zhibin Chen

Division of Engineering and Computer Science, NYU Shanghai; Center for Data Science and Artificial Intelligence, NYU Shanghai; Shanghai Key Laboratory of Urban Design and Urban Science

Date Written: June 14, 2022

Abstract

Given the uncertainty of ride orders and the dynamically changing workload of charging stations, dispatching and charging an electric vehicle (EV) fleet is a significant challenge for e-hailing platforms. The common practice is to dispatch EVs to orders with heuristic matching methods while letting EV drivers make charging decisions independently based on their own experience. Such a practice, however, may be suboptimal and thus compromise the platform's performance. This study proposes a Markov decision process (MDP) that jointly optimizes the charging and dispatching decisions of an e-hailing EV fleet providing exclusive pick-up service for passengers at a transportation hub. The objective is to maximize the total revenue of the fleet over a finite horizon. The complete state transition equations of the EV fleet are formally formulated, accounting for the reusability of the EVs and the changing states of their batteries. Because of the curse of dimensionality, the proposed MDP is computationally intractable for exact dynamic programming (DP), so an online approximation algorithm is developed that integrates a model-based reinforcement learning (RL) framework with a novel SARSA(Δ)-sample average approximation (SAA) architecture. Compared with model-free RL algorithms and approximate DP, our algorithm explores high-quality decisions through an SAA model built on empirical state transitions while exploiting the best decisions found so far through SARSA(Δ) sample-trajectory updates. Computational results based on a real case show that, compared with an existing heuristic method and an approximate DP from the literature, the proposed approach increases daily revenue by an average of 31.76% and 14.22%, respectively.
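The SARSA(Δ)-SAA architecture itself is specific to the paper, but the sample-trajectory updating it builds on can be illustrated with standard tabular SARSA(λ) using eligibility traces. The toy chain MDP, all parameter values, and the `step`/`policy` helpers below are illustrative assumptions for the sketch, not the paper's fleet model.

```python
import numpy as np

# Illustrative sketch: tabular SARSA(lambda) on a toy 5-state chain.
# Action 1 moves one state to the right, action 0 stays put; a reward
# of 1 is collected on reaching the terminal state. These dynamics and
# hyperparameters are assumptions for illustration only.
n_states, n_actions = 5, 2
alpha, gamma, lam, eps = 0.1, 0.95, 0.9, 0.1
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

def step(s, a):
    s2 = min(s + a, n_states - 1)
    done = s2 == n_states - 1
    return s2, (1.0 if done else 0.0), done

def policy(s):
    # epsilon-greedy action selection over the current Q-table
    return rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())

for _ in range(200):
    e = np.zeros_like(Q)              # eligibility traces for this episode
    s, a = 0, policy(0)
    done = False
    while not done:
        s2, r, done = step(s, a)
        a2 = policy(s2)
        # temporal-difference error for the sampled transition
        delta = r + gamma * Q[s2, a2] * (not done) - Q[s, a]
        e[s, a] += 1.0                # accumulating trace
        Q += alpha * delta * e        # credit spread along the trajectory
        e *= gamma * lam              # decay traces for older visits
        s, a = s2, a2

print(f"Q(start, move-right) = {Q[0, 1]:.3f}")
```

The eligibility trace `e` propagates each temporal-difference error backward along the visited trajectory, which is the sample-trajectory-updating idea; the fleet-level state space considered in the paper would of course require the SAA-based approximation machinery described in the abstract rather than a lookup table.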

Keywords: Transportation, Electric vehicle, Charging and dispatching decision, Reinforcement learning, Markov decision process

JEL Classification: L91

Suggested Citation

Yan, Pengyu and Yu, Kaize and Chao, Xiuli and Chen, Zhibin, An Online Reinforcement Learning Approach to Charging and Order-Dispatching Optimization for An E-hailing Electric Vehicle Fleet (June 14, 2022). Available at SSRN: https://ssrn.com/abstract=4146312 or http://dx.doi.org/10.2139/ssrn.4146312

Pengyu Yan (Contact Author)

University of Electronic Science and Technology of China (UESTC)

Kaize Yu

University of Electronic Science and Technology of China (UESTC)

Chengdu 610054
China

Xiuli Chao

University of Michigan at Ann Arbor - Department of Industrial and Operations Engineering

1205 Beal Avenue
Ann Arbor, MI 48109
United States

Zhibin Chen

Division of Engineering and Computer Science, NYU Shanghai

1555 Century Ave
Shanghai 200122
China

Center for Data Science and Artificial Intelligence, NYU Shanghai

1555 Century Ave
Shanghai 200122
China

Shanghai Key Laboratory of Urban Design and Urban Science

1555 Century Ave
Shanghai 200122
China
