affiliation not provided to SSRN
Visual object tracking, Transformer, Spatio-temporal modeling, Graph attention network, Trajectory-aware network, Video representation learning