Negative Prompt Driven Complementary Parallel Representation for Open-World 3D Object Retrieval

Yang Xu, Yifan Feng, Yue Gao

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 1498-1506. https://doi.org/10.24963/ijcai.2024/166

The limited availability of supervised labels (positive information) poses a notable challenge for open-world retrieval. Negative information, by contrast, is easier to obtain but remains underexploited in current methods. In this paper, we introduce the Negative Prompt Driven Complementary Parallel Representation (NPCP) framework, which approaches open-world retrieval through the lens of Negative Prompts. Specifically, we employ the Parallel Exclusive Embedding (PEE) to exploit the prompt information, capturing both explicit negative and implicit positive signals. To address the challenges of embedding unification and generalization, our method leverages high-order correlations among objects through Complementary Structure Tuning (CST), which constructs a complementary hypergraph based on bi-directional and cross-category correlations. We have developed four multimodal datasets for open-world 3D object retrieval with negative prompts: NPMN, NPAB, NPNT, and NPES. Extensive experiments and ablation studies on these four benchmarks demonstrate the superiority of our method over current state-of-the-art approaches.
Keywords:
Computer Vision: CV: 3D computer vision
Computer Vision: CV: Image and video retrieval
Computer Vision: CV: Representation learning