Dual Semantic Fusion Hashing for Multi-Label Cross-Modal Retrieval

Kaiming Liu, Yunhong Gong, Yu Cao, Zhenwen Ren, Dezhong Peng, Yuan Sun

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4569-4577. https://doi.org/10.24963/ijcai.2024/505

Cross-modal hashing (CMH) has been widely used for multi-modal retrieval tasks due to its low storage cost and fast query speed. Although existing CMH methods achieve promising performance, most of them rely mainly on coarse-grained supervision information (i.e., a pairwise similarity matrix) to measure the semantic similarities between instances, ignoring the impact of the multi-label distribution. To address this issue, we construct a fine-grained semantic similarity to explore cluster-level semantic relationships between multi-label data, and propose a new dual semantic fusion hashing (DSFH) method for multi-label cross-modal retrieval. Specifically, we first learn modal-specific representations and consensus hash codes, thereby merging specificity with consistency. We then fuse the coarse-grained and fine-grained semantics to mine multi-level semantic relationships, thereby enhancing the discrimination of the hash codes. Extensive experiments on three benchmarks demonstrate the superior performance of our DSFH compared with 16 state-of-the-art methods.
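To make the coarse-grained vs. fine-grained distinction concrete, the following is a minimal sketch, not the paper's exact formulation: it assumes the coarse-grained similarity is the usual binary label-overlap matrix, uses cosine similarity of multi-label vectors as a stand-in for the fine-grained (cluster-level) similarity, and fuses them with a hypothetical trade-off hyperparameter `alpha`.

```python
import numpy as np

# L is an (n x c) multi-label matrix: L[i, k] = 1 if instance i has label k.

def coarse_similarity(L):
    """Binary pairwise similarity: 1 if two instances share any label."""
    return (L @ L.T > 0).astype(float)

def fine_similarity(L):
    """Cosine similarity of multi-label vectors (an assumed proxy for the
    paper's cluster-level similarity), capturing how much of the label
    distribution two instances share rather than just whether they overlap."""
    norms = np.linalg.norm(L, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against all-zero label rows
    Ln = L / norms
    return Ln @ Ln.T

def fused_similarity(L, alpha=0.5):
    """Convex combination of the two similarity levels;
    alpha is a hypothetical trade-off hyperparameter, not from the paper."""
    return alpha * coarse_similarity(L) + (1.0 - alpha) * fine_similarity(L)

# Toy example: instances 0 and 1 share a label; instance 2 shares none.
L = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
S = fused_similarity(L, alpha=0.5)
```

In this toy example, the fused similarity between instances 0 and 1 is strictly higher than the binary-only view would suggest for pairs with identical labels, while pairs with no shared label remain at zero; the fine-grained term is what lets the fused matrix rank partially overlapping label sets differently from fully matching ones.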
Keywords:
Machine Learning: ML: Multi-modal learning
Machine Learning: ML: Multi-view learning