Forecasting Airport Transfer Passenger Flow Using Real-Time Data and Machine Learning

36 Pages. Posted: 10 Oct 2018. Last revised: 8 May 2019


Xiaojia Guo

University College London - School of Management

Yael Grushka-Cockayne

University of Virginia - Darden School of Business; Harvard University - Business School (HBS)

Bert De Reyck

UCL School of Management

Date Written: May 1, 2019


Problem definition: In collaboration with Heathrow airport, we develop a two-phased predictive system that produces forecasts of transfer passenger flows. In the first phase, the system predicts the entire distribution of transfer passengers’ connection times. In the second phase, the system samples from the distribution of individual connection times and produces distributional forecasts for the number of passengers arriving at the immigration and security areas.
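The second phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the names `inbound_arrival`, `connection_samples`, and `simulate_flow` are hypothetical. Each passenger's connection-time distribution is assumed to be available as a set of sampled values, and passengers are drawn independently, which is a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: 200 transfer passengers landing between 08:00 and 09:00
# (minutes after midnight), each with 50 sampled connection times (minutes)
# standing in for the phase-one predictive distribution.
n_passengers, n_draws = 200, 50
inbound_arrival = rng.uniform(480, 540, size=n_passengers)
connection_samples = rng.gamma(4.0, 5.0, size=(n_passengers, n_draws))

def simulate_flow(inbound_arrival, connection_samples, bin_edges, n_sims, rng):
    """Return an (n_sims, n_bins) array of simulated arrival counts per interval."""
    n, n_draws = connection_samples.shape
    rows = np.arange(n)
    counts = np.empty((n_sims, len(bin_edges) - 1), dtype=int)
    for s in range(n_sims):
        # One draw per passenger from that passenger's own distribution.
        pick = rng.integers(0, n_draws, size=n)
        arrival = inbound_arrival + connection_samples[rows, pick]
        # Arrivals outside the forecast window are ignored by the histogram.
        counts[s], _ = np.histogram(arrival, bins=bin_edges)
    return counts

bin_edges = np.arange(480, 601, 15)  # 15-minute intervals from 08:00 to 10:00
counts = simulate_flow(inbound_arrival, connection_samples, bin_edges, 500, rng)

# Distributional forecast: 5%, 50% and 95% quantiles of the count per interval.
flow_quantiles = np.quantile(counts, [0.05, 0.5, 0.95], axis=0)
print(flow_quantiles)
```

Each column of `flow_quantiles` gives a forecast interval for the number of passengers reaching the checkpoint in one 15-minute window.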

Academic/Practical relevance: Airports and airlines have been challenged to improve decision-making by producing accurate forecasts in real time. Our work is the first to apply machine learning to produce real-time distributional forecasts of journeys within an airport using passenger-level data. Better forecasts of these journeys can help optimize passenger experience and improve airport resource deployment.

Methodology: The predictive system developed is based on a regression tree combined with copula-based simulations. We generalize the tree method to predict complete distributions, moving beyond point forecasts. To derive insights from the tree, we introduce the concept of a stable tree that can be summarized by its key variables’ splits.
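One common way to make a regression tree return a full distribution rather than a point forecast is to keep the training outcomes that land in each leaf and read quantiles from them. The sketch below illustrates that idea with scikit-learn on synthetic data. The features, coefficients, and the helper `predict_quantiles` are assumptions made for illustration, and the abstract does not confirm that the authors' generalization works this way.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic, hypothetical training data: two features and a connection time.
n = 2000
X = np.column_stack([
    rng.integers(0, 2, size=n),   # e.g. 1 if the inbound flight is long-haul
    rng.uniform(0, 60, size=n),   # e.g. inbound delay in minutes
])
y = 25 + 10 * X[:, 0] + 0.2 * X[:, 1] + rng.gamma(2.0, 4.0, size=n)

# A shallow tree with large leaves, so each leaf holds enough observations
# to estimate quantiles.
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=100, random_state=0)
tree.fit(X, y)

# Store the training outcomes in each leaf as that leaf's empirical distribution.
leaf_of_train = tree.apply(X)
leaf_samples = {leaf: y[leaf_of_train == leaf] for leaf in np.unique(leaf_of_train)}

def predict_quantiles(X_new, probs):
    """Quantiles of the empirical distribution in the leaf each row falls into."""
    leaves = tree.apply(X_new)
    return np.array([np.quantile(leaf_samples[leaf], probs) for leaf in leaves])

X_new = np.array([[0, 5.0], [1, 45.0]])
q = predict_quantiles(X_new, [0.1, 0.5, 0.9])
print(q)  # one row per passenger: 10%, 50% and 90% connection-time quantiles
```

The stored `leaf_samples` can also be sampled from directly, which is what a simulation step would need as input.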

Results: Theoretically, we show that point forecasts of passenger flows generated by the two-phased approach are unbiased, and that distributional forecasts can be well-calibrated when adding correlation as a tuning parameter. When compared to benchmarks, our two-phased approach is shown to be more accurate in predicting both connection times and passenger flows.
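The role of correlation as a tuning parameter can be illustrated with a small simulation. The sketch below assumes an equicorrelated Gaussian copula with a single parameter `rho` in [0, 1) and a shared synthetic marginal distribution. Both are assumptions for illustration and not necessarily the copula or marginals used in the paper. The expected count does not depend on `rho`, because each passenger's marginal distribution is unchanged, while the spread of the count grows with `rho`.

```python
import math
import numpy as np

rng = np.random.default_rng(2)

# Standard normal CDF applied elementwise, using only the standard library.
std_normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def correlated_uniforms(n_passengers, n_sims, rho, rng):
    """One-factor Gaussian copula: every pair of passengers has correlation rho."""
    common = rng.standard_normal((n_sims, 1))
    own = rng.standard_normal((n_sims, n_passengers))
    z = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * own
    return std_normal_cdf(z)

# Hypothetical marginal: a sorted sample of connection times (minutes).
marginal_sorted = np.sort(rng.gamma(4.0, 5.0, size=1000))
threshold = 20.0  # count passengers whose connection time is at most 20 minutes

def count_mean_and_sd(rho, n_passengers=100, n_sims=2000):
    u = correlated_uniforms(n_passengers, n_sims, rho, rng)
    # Inverse-CDF transform through the empirical marginal.
    idx = np.minimum((u * len(marginal_sorted)).astype(int), len(marginal_sorted) - 1)
    times = marginal_sorted[idx]
    counts = (times <= threshold).sum(axis=1)
    return counts.mean(), counts.std()

mean_indep, sd_indep = count_mean_and_sd(0.0)
mean_corr, sd_corr = count_mean_and_sd(0.3)
print(mean_indep, sd_indep)
print(mean_corr, sd_corr)
```

In practice `rho` would be chosen so that the forecast intervals cover realized counts at their nominal rate on held-out data.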

Managerial implications: Our predictive system can produce accurate forecasts, frequently, and in real time. With these forecasts, an airport’s operating team can make data-driven decisions, identify late passengers, and help them make their connections. The airport can also update its resourcing plans based on the predicted passenger flows. Our approach can be generalized to other operations management domains, such as rail or hospitals, in which both arrival times and the number of arrivals need to be accurately predicted.

Keywords: quantile forecasts; regression tree; copula; passenger flow management; data-driven operations

JEL Classification: M1

Suggested Citation

Guo, Xiaojia and Grushka-Cockayne, Yael and De Reyck, Bert, Forecasting Airport Transfer Passenger Flow Using Real-Time Data and Machine Learning (May 1, 2019).

Xiaojia Guo

University College London - School of Management

Gower Street
London, WC1E 6BT
United Kingdom

Yael Grushka-Cockayne (Contact Author)

University of Virginia - Darden School of Business

P.O. Box 6550
Charlottesville, VA 22906-6550
United States

Harvard University - Business School (HBS)

Soldiers Field Road
Morgan 270C
Boston, MA 02163
United States

Bert De Reyck

UCL School of Management

London, WC1E 6BT
United Kingdom
