STVANet: A Spatio-Temporal Visual Attention Framework with Large Kernel Attention Mechanism for Citywide Traffic Dynamics Prediction

21 Pages Posted: 22 Dec 2023

Hongtai Yang

Southwest Jiaotong University

Junbo Jiang

Southwest Jiaotong University

Zhan Zhao

The University of Hong Kong

Renbin Pan

Southwestern University of Finance and Economics (SWUFE)

Abstract

Enhancing the efficiency and safety of intelligent transportation systems requires effective modeling and prediction of citywide traffic dynamics. Most studies employ convolutional neural networks (CNNs) with a 3D convolutional structure or spatio-temporal models with self-attention mechanisms to capture the spatio-temporal information of traffic distribution. Although 3D CNNs excel at capturing local contextual information, they are computationally expensive because of their large number of parameters, and they cannot capture long-range dependencies. By contrast, self-attention mechanisms, originally designed for natural language processing, can capture long-range dependencies, but applying them to 2D image structures requires flattening the inherent 2D context into a 1D sequence, which increases computational complexity and neglects the adaptability between local contextual information and channels. Accordingly, we propose the spatio-temporal visual attention network (STVANet), a novel 2D CNN that integrates a visual attention module, built from a large kernel attention (LKA) mechanism and a feedforward component, to capture long-range dependencies and channel information in urban traffic data while preserving the 2D image structure. LKA-based spatio-temporal attention networks extract spatial and temporal features from weekly, daily, and recent hourly periods and aggregate them, with weighted consideration of external features, to make predictions. Evaluation on real-world datasets demonstrates STVANet’s superiority over baseline models, showcasing its potential for citywide traffic prediction.
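To make the large kernel attention idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the decomposition commonly described in the LKA literature for approximating a 21 x 21 kernel: a 5 x 5 depthwise convolution, a 7 x 7 depthwise convolution with dilation 3, and a 1 x 1 pointwise convolution, whose output reweights the input element by element. The abstract does not state the kernel sizes STVANet uses, and the function names, the nested-list layout `[channel][row][column]`, and the weight arguments are all hypothetical.

```python
def depthwise_conv2d(x, kernels, dilation=1):
    """Per-channel 2D cross-correlation with zero padding (output keeps H x W).

    x is a nested list [C][H][W]; kernels is [C][k][k] with odd k.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    half = (k // 2) * dilation  # offset that centres the (dilated) kernel
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for c in range(channels):
        for i in range(height):
            for j in range(width):
                acc = 0.0
                for u in range(k):
                    ii = i + u * dilation - half
                    if not 0 <= ii < height:
                        continue  # zero padding outside the grid
                    for v in range(k):
                        jj = j + v * dilation - half
                        if 0 <= jj < width:
                            acc += kernels[c][u][v] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


def pointwise_conv2d(x, weight):
    """1 x 1 convolution mixing channels; weight is [C_out][C_in]."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    return [
        [
            [sum(row[c] * x[c][i][j] for c in range(channels)) for j in range(width)]
            for i in range(height)
        ]
        for row in weight
    ]


def large_kernel_attention(x, w_local, w_dilated, w_point):
    """Assumed LKA decomposition: local conv, dilated conv, channel mixing.

    The resulting attention map multiplies the input element-wise, so the
    2D grid layout is preserved (no flattening into a 1D sequence).
    w_point must be square (C x C) so the attention map matches x in shape.
    """
    attn = depthwise_conv2d(x, w_local)                # assumed 5 x 5, local context
    attn = depthwise_conv2d(attn, w_dilated, dilation=3)  # assumed 7 x 7, long range
    attn = pointwise_conv2d(attn, w_point)             # channel adaptability
    return [
        [[a * v for a, v in zip(a_row, x_row)] for a_row, x_row in zip(a_ch, x_ch)]
        for a_ch, x_ch in zip(attn, x)
    ]
```

In this sketch, a citywide traffic snapshot would be passed as a grid with one channel per flow type (for example, inflow and outflow), and the output has the same shape as the input. The triple loop is written for readability only; a practical model would use a deep learning framework's grouped and dilated convolutions.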

Keywords: Traffic Information, 2D ConvNets, Spatio-temporal Data, Large Kernel Attention, Deep Learning

Suggested Citation

Yang, Hongtai and Jiang, Junbo and Zhao, Zhan and Pan, Renbin, STVANet: A Spatio-Temporal Visual Attention Framework with Large Kernel Attention Mechanism for Citywide Traffic Dynamics Prediction. Available at SSRN: https://ssrn.com/abstract=4673691 or http://dx.doi.org/10.2139/ssrn.4673691

Hongtai Yang

Southwest Jiaotong University

No. 111, Sec. North 1, Er-Huan Rd.
Chengdu, 610031
China

Junbo Jiang (Contact Author)

Southwest Jiaotong University

No. 111, Sec. North 1, Er-Huan Rd.
Chengdu, 610031
China

Zhan Zhao

The University of Hong Kong

Pokfulam Road
Hong Kong, HK
China

Renbin Pan

Southwestern University of Finance and Economics (SWUFE)

Chengdu
China
