LCMA-Net: Light Cross-Modal Attention Network for Streamer Re-Identification in Live Video
36 Pages Posted: 23 Jun 2023
Abstract
With the rapid expansion of the we-media industry, some streamers slip inappropriate content into live video to attract traffic and pursue profit. Because these blacklisted streamers forge identities or switch platforms to keep streaming, causing great harm to the Internet environment, streamer re-identification (re-id) is of paramount importance. Streamer biometrics in live video are multimodal, comprising voiceprint, face, and spatiotemporal information, and these modalities complement one another. To exploit this, we propose a light cross-modal attention network (LCMA-Net) for streamer re-id in live video. First, the voiceprint, face, and spatiotemporal features of the streamer are extracted by RawNet-SA, II-Net, and STDA-ResNeXt3D, respectively. Then, within LCMA-Net, a light cross-modal pooling attention (LCMPA) module is combined with a multi-layer perceptron (MLP) to align the features of the different modalities and concatenate them into multimodal features. Finally, the streamer is re-identified by measuring the similarity between multimodal features. Five experiments are conducted on the StreamerReID dataset, and the results demonstrate that our method achieves competitive performance.
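The final matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the LCMPA module and MLP are replaced by a simple per-modality L2 normalization, cosine similarity is assumed as the similarity measure (the abstract does not name one), and the function names, streamer IDs, and toy two-dimensional features are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(voice, face, spatiotemporal):
    """Stand-in for the alignment step: normalize each modality, then concatenate.

    Assumption: the paper's LCMPA + MLP alignment is NOT reproduced here.
    """
    fused = []
    for feat in (voice, face, spatiotemporal):
        fused.extend(l2_normalize(feat))
    return fused


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def re_identify(query, gallery):
    """Return the gallery ID whose fused feature is most similar to the query."""
    scores = {sid: cosine_similarity(query, feat) for sid, feat in gallery.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy gallery of known (e.g. blacklisted) streamers; values are made up.
gallery = {
    "streamer_a": fuse([1.0, 0.0], [0.9, 0.1], [0.2, 0.8]),
    "streamer_b": fuse([0.0, 1.0], [0.1, 0.9], [0.7, 0.3]),
}
query = fuse([0.9, 0.1], [1.0, 0.0], [0.3, 0.7])
best_id, score = re_identify(query, gallery)
print(best_id, round(score, 3))
```

With these toy values the query lies closest to `streamer_a`. In practice the three inputs would be the embeddings produced by the voiceprint, face, and spatiotemporal extractors, and a decision threshold on the score would be needed to reject streamers absent from the gallery.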
Keywords: live video, streamer, re-identification, light cross-modal attention network, light cross-modal pooling attention