Enhancing Multimodal Knowledge Graph Representation Learning through Triple Contrastive Learning

Yuxing Lu, Weichen Zhao, Nan Sun, Jinzhuo Wang

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5963-5971. https://doi.org/10.24963/ijcai.2024/659

Multimodal knowledge graphs incorporate multimodal information rather than pure symbols, which significantly enhances the representation of knowledge graphs and their capacity to understand the world. Despite these advancements, existing multimodal fusion techniques still face significant challenges in representing modalities and fully integrating the diverse attributes of entities, particularly when dealing with more than one modality. To address this issue, this article proposes a Knowledge Graph Multimodal Representation Learning (KG-MRI) method. This method utilizes foundation models to represent different modalities and incorporates a triple contrastive learning model and a dual-phase training strategy to effectively fuse the different modalities with knowledge graph embeddings. We conducted comprehensive comparisons with several knowledge graph embedding methods to validate the effectiveness of our KG-MRI model. Furthermore, validation on a real-world Non-Alcoholic Fatty Liver Disease (NAFLD) cohort demonstrated that the vector representations learned through our methodology possess enhanced representational capabilities, showing promise for broader applications in complex multimodal environments.
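To make the core idea concrete, the sketch below shows one plausible form of a triple contrastive objective: a symmetric InfoNCE loss summed over the three pairs of modality embeddings. This is a minimal illustration, not the authors' implementation; the choice of PyTorch, the function names, and the assumption of three modalities (structural KG embeddings plus text and image embeddings from foundation models, projected to a shared space) are all assumptions for illustration.

```python
# Minimal sketch of a triple contrastive objective (illustrative, not the
# paper's code): pairwise InfoNCE losses summed over three modality pairs.
import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of aligned embeddings.

    Row i of `a` and row i of `b` are assumed to describe the same entity
    (the positive pair); all other rows in the batch act as negatives.
    """
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                      # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)    # diagonal = positives
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


def triple_contrastive_loss(struct_emb: torch.Tensor,
                            text_emb: torch.Tensor,
                            image_emb: torch.Tensor,
                            temperature: float = 0.07) -> torch.Tensor:
    """Sum the pairwise contrastive terms over all three modality pairs."""
    return (info_nce(struct_emb, text_emb, temperature)
            + info_nce(struct_emb, image_emb, temperature)
            + info_nce(text_emb, image_emb, temperature))


if __name__ == "__main__":
    # Example: a batch of 32 entities, each modality projected to 256 dimensions
    # (e.g., by learned projection heads on top of frozen foundation models).
    s, t, v = (torch.randn(32, 256) for _ in range(3))
    print(triple_contrastive_loss(s, t, v).item())
```

Summing all three pairwise terms encourages every pair of modalities to agree in a shared embedding space, rather than anchoring the alignment to any single modality.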
Keywords:
Multidisciplinary Topics and Applications: MTA: Bioinformatics
Data Mining: DM: Knowledge graphs and knowledge base completion
Machine Learning: ML: Multi-modal learning
Multidisciplinary Topics and Applications: MTA: Life sciences