Multi Model Attention Network for Video Source Camera Identification

Chi, Jiaqi; Wu, Zhuocheng; Wang, Wei; Hou, Jiayao; Wang, Bo

doi:10.2139/ssrn.5009590

10 Pages. Posted: 5 Nov 2024

Wei Wang

Chinese Academy of Sciences (CAS) - Institute of Automation

With the development of smartphones and short-video platforms, digital video has become an important medium for information dissemination. However, the widespread distribution of videos has also brought many social issues. Video Source Camera Identification (VSCI) has emerged as a crucial component of video forensics, playing an important role in combating false information and improving media credibility. Common solutions are based on Photo Response Non-Uniformity (PRNU) or on machine learning. However, most existing research has ignored an important piece of information present in videos: acoustic features. The contributions of audio and visuals to scene understanding evolve over time, so an efficient solution should be adaptive. To address this challenge, we propose the Multi Modal Attention Network (MMAnet), which dynamically fuses visual and audio features for VSCI. We also use Gated Recurrent Units (GRU) to exploit temporal information. In experiments on public benchmark datasets, including VISION, Daxing, and QUFVD, our model achieved satisfactory performance.
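
The abstract does not specify the MMAnet architecture, so the following is only a minimal illustrative sketch of the general idea it names: per-time-step attention weights decide how much the visual and audio features each contribute, and a GRU then aggregates the fused sequence over time. Everything here is an assumption made for illustration, not the authors' method: the dot-product scoring rule, the `query` vector, the function names (`fuse_step`, `gru_step`, `encode_video`), the feature sizes, and the random weights. The GRU step follows one common textbook formulation with biases omitted.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_step(visual, audio, query):
    """Fuse one time step of visual and audio features.

    Hypothetical scoring rule: each modality is scored by its dot product
    with a query vector, and the scores are normalised with softmax.
    """
    weights = softmax([dot(query, visual), dot(query, audio)])
    fused = [weights[0] * v + weights[1] * a for v, a in zip(visual, audio)]
    return fused, weights


def gru_step(x, h, p):
    """One GRU update (update gate z, reset gate r), biases omitted."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_video(visual_seq, audio_seq, query, params, hidden_size):
    """Fuse each time step, then aggregate the fused sequence with a GRU."""
    h = [0.0] * hidden_size
    all_weights = []
    for visual, audio in zip(visual_seq, audio_seq):
        fused, weights = fuse_step(visual, audio, query)
        all_weights.append(weights)
        h = gru_step(fused, h, params)
    return h, all_weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy data: random features stand in for real visual and audio embeddings.
rng = random.Random(0)
feat, hidden, steps = 4, 3, 5
params = {
    name: random_matrix(hidden, feat if name[0] == "W" else hidden, rng)
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
}
visual_seq = [[rng.gauss(0, 1) for _ in range(feat)] for _ in range(steps)]
audio_seq = [[rng.gauss(0, 1) for _ in range(feat)] for _ in range(steps)]
query = [rng.gauss(0, 1) for _ in range(feat)]

video_embedding, modality_weights = encode_video(
    visual_seq, audio_seq, query, params, hidden
)
```

In this sketch, `modality_weights` holds one (visual, audio) weight pair per time step, which is what makes the fusion adaptive over time; `video_embedding` is the final GRU state, which a classifier over camera models could consume.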

Keywords: Video Source Camera Identification, Visual and Audio Fusion, Multi Modal Attention Network, Gated Recurrent Units

Suggested Citation:

Chi, Jiaqi and Wu, Zhuocheng and Wang, Wei and Hou, Jiayao and Wang, Bo, Multi Model Attention Network for Video Source Camera Identification. Available at SSRN: https://ssrn.com/abstract=5009590 or http://dx.doi.org/10.2139/ssrn.5009590