affiliation not provided to SSRN
Cross-modal retrieval, semi-supervised learning, curriculum learning, contrastive learning
1. Multi-object action modeling; 2. Large vision-language models; 3. Airport visual surveillance