affiliation not provided to SSRN
1. Multi-object action modeling; 2. Large vision-language models; 3. Airport visual surveillance