General Multi-Agent Reinforcement Learning Integrating Adaptive Manoeuvre Strategy for Real-Time Multi-Aircraft Conflict Resolution
55 Pages Posted: 10 Aug 2022
Abstract
Reinforcement learning (RL) techniques are being studied for conflict resolution (CR) in air traffic management (ATM) because they offer high computational performance and can cope with flight uncertainty. Existing RL-based CR methods generalise poorly, however, which makes them difficult to apply effectively in practice. This paper proposes a general multi-agent reinforcement learning (MARL) method that integrates an adaptive manoeuvre strategy to improve both the efficiency of the solution and the generalisation of the model in multi-aircraft conflict resolution (MACR). A partial observation approach based on the imminent threats in detection sectors is applied to collect critical environmental information so that the model can be used in arbitrary scenarios. Agents are trained to provide a proper flight intention (e.g., speed up and yaw to the left). An adaptive manoeuvre strategy then generates the specific manoeuvre (i.e., speed and heading parameters) according to the flight intention. A warning area around each aircraft is introduced to cope with flight uncertainty and with problems arising from non-stationarity in MARL. A state-of-the-art Deep Q-learning Network (DQN) method, Rainbow DQN, is employed to improve the efficiency of RL. The multi-agent system is trained and deployed in a distributed manner to adapt to practical scenarios. Sensitivity analysis of uncertainty levels and warning area sizes is performed to explore their impact on the proposed method. Simulation experiments verify the effectiveness of the training and the generalisation of the proposed method. In experiments, the proposed method outperforms state-of-the-art RL-based CR methods when compared against their publicly available data.
Keywords: Air traffic management, Multi-aircraft conflict resolution, Multi-agent reinforcement learning, Deep Q-learning network, Generalisation, Uncertainty
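The abstract's idea of a sector-based partial observation can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `sector_observation`, the choice of 8 sectors, the 100-unit detection range, and the use of the nearest intruder as a proxy for the "imminent threat" in each sector are all assumptions made for illustration. The point it shows is that the observation has a fixed length regardless of how many aircraft are present, which is one plausible reason such an encoding can be reused in arbitrary scenarios.

```python
import math

def sector_observation(own_xy, own_heading_deg, intruders_xy,
                       n_sectors=8, max_range=100.0):
    """Return one normalised distance per detection sector.

    Hypothetical encoding (assumption, not from the paper): each sector
    keeps only its nearest intruder; 1.0 means no intruder within range.
    """
    obs = [1.0] * n_sectors
    width = 360.0 / n_sectors
    for x, y in intruders_xy:
        dx, dy = x - own_xy[0], y - own_xy[1]
        dist = math.hypot(dx, dy)
        if dist >= max_range:
            continue  # outside the assumed detection range
        # Bearing clockwise from north, made relative to the own heading.
        bearing = math.degrees(math.atan2(dx, dy))
        relative = (bearing - own_heading_deg) % 360.0
        sector = int(relative // width) % n_sectors
        # Keep only the closest intruder seen so far in this sector.
        obs[sector] = min(obs[sector], dist / max_range)
    return obs
```

Calling `sector_observation((0, 0), 0.0, [(0, 50), (0, 20), (30, 0)])` returns a list of 8 values whatever the number of intruders, so the same network input shape would serve for two aircraft or twenty.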