Cross-Modal Hashing Retrieval with Compatible Triplet Representation
Abstract
Cross-modal hashing retrieval has emerged as a promising approach to handling diverse multimodal data, owing to its storage efficiency and fast query speed. However, existing cross-modal hashing retrieval methods often oversimplify cross-modal similarity by considering only identical labels across modalities, and they are sensitive to noise in the original multimodal data. To address these limitations, we propose a cross-modal hashing retrieval approach with compatible triplet representation. The proposed approach integrates the essential feature representations and semantic information of texts and images into their corresponding multi-label feature representations, and introduces a fusion attention module that extracts channel attention features from the text modality and spatial attention features from the image modality, thereby enriching the compatible triplet-based semantic information used in cross-modal hashing learning. Comprehensive experiments on three public datasets demonstrate that the proposed approach outperforms state-of-the-art methods in retrieval accuracy.
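The abstract does not include an implementation, so the following PyTorch sketch only illustrates what a fusion attention module of this general shape might look like: channel attention (squeeze-and-excitation style) reweighting text features, spatial attention reweighting image feature maps, and both branches projected to relaxed hash codes. All class names, parameter names, and dimensions (ChannelAttention, SpatialAttention, FusionAttention, reduction, hash_bits, etc.) are hypothetical assumptions, not the authors' architecture.

```python
# Hypothetical sketch (not the paper's implementation): channel attention
# for text features, spatial attention for image feature maps, fused into
# relaxed binary codes of a shared length.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention over text features."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) text feature; reweight each feature channel.
        return x * self.fc(x)


class SpatialAttention(nn.Module):
    """Spatial attention over an image feature map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W); pool over channels, learn a 2D map.
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn


class FusionAttention(nn.Module):
    """Fuse attended text and image features into a common hash space."""
    def __init__(self, text_dim: int, img_channels: int, hash_bits: int):
        super().__init__()
        self.text_att = ChannelAttention(text_dim)
        self.img_att = SpatialAttention()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.text_hash = nn.Linear(text_dim, hash_bits)
        self.img_hash = nn.Linear(img_channels, hash_bits)

    def forward(self, text: torch.Tensor, image: torch.Tensor):
        t = self.text_att(text)                        # (B, text_dim)
        v = self.pool(self.img_att(image)).flatten(1)  # (B, img_channels)
        # tanh relaxes the binary codes for end-to-end training.
        return torch.tanh(self.text_hash(t)), torch.tanh(self.img_hash(v))


if __name__ == "__main__":
    fuse = FusionAttention(text_dim=512, img_channels=256, hash_bits=64)
    codes_t, codes_v = fuse(torch.randn(8, 512), torch.randn(8, 256, 14, 14))
    print(codes_t.shape, codes_v.shape)  # torch.Size([8, 64]) each
```

In practice, the relaxed codes from both branches would then be trained with a triplet-style objective, for instance torch.nn.TripletMarginLoss over anchor-positive-negative tuples drawn across modalities, which is one plausible reading of the "compatible triplet" learning the abstract describes.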
Keywords: Cross-modal hashing retrieval, Compatible triplet, Label network, Fusion attention