Multi-Level Alignment for Few-Shot Temporal Action Localization
27 Pages Posted: 29 Jan 2023
Abstract
Temporal action localization (TAL), which aims to localize actions occurring in a long untrimmed video, requires a large amount of annotated training data. In real-life applications, however, segment-level annotations are very expensive to obtain for large-scale datasets, and the space of possible action classes is far too large to annotate exhaustively. To overcome this challenge, we present a novel few-shot learning method that localizes temporal actions for previously unseen novel classes with only a few training samples. Unlike previous methods, which do not exploit the alignment of visual information at each temporal location, we propose a novel multi-level encoder cosine-similarity alignment module that implicitly learns spatiotemporal context alignment for long untrimmed videos. Towards this objective, our method adopts an episodic training scheme that learns to align similar video snippets between videos belonging to the same class from few training examples. At test time, this learned alignment of context information is adapted to novel unseen classes. Experimental results on two standard datasets, ActivityNet-1.3 and THUMOS-14, show that our method outperforms other state-of-the-art methods for few-shot temporal action localization with single and multiple action instances on ActivityNet-1.3, and achieves competitive results on THUMOS-14.
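The abstract does not give implementation details, but the core operation it names, cosine-similarity alignment between snippet-level features of two videos, can be illustrated with a minimal sketch. Everything here (function name, NumPy implementation, feature shapes) is our assumption for illustration, not the authors' code:

```python
import numpy as np

def snippet_alignment(query, support, eps=1e-8):
    """Cosine-similarity alignment between two videos' snippet features.

    query:   (Tq, D) array, one D-dim feature per snippet of the query video
    support: (Ts, D) array, one D-dim feature per snippet of a support video
    Returns a (Tq, Ts) matrix whose entry (i, j) is the cosine similarity
    between query snippet i and support snippet j; high values indicate
    snippets that likely depict the same action.
    """
    # L2-normalize each snippet feature, then take pairwise dot products.
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + eps)
    s = support / (np.linalg.norm(support, axis=1, keepdims=True) + eps)
    return q @ s.T
```

In an episodic training scheme, a matrix like this would be computed between a query video and each support example of the same class, so that the model learns which temporal locations correspond; the same machinery then transfers to unseen classes at test time.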
Keywords: few-shot learning, temporal action localization, feature alignment, cosine similarity