Where Elegance Meets Precision: Towards a Compact, Automatic, and Flexible Framework for Multi-modality Image Fusion and Applications

Where Elegance Meets Precision: Towards a Compact, Automatic, and Flexible Framework for Multi-modality Image Fusion and Applications

Jinyuan Liu, Guanyao Wu, Zhu Liu, Long Ma, Risheng Liu, Xin Fan

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 1110-1118. https://doi.org/10.24963/ijcai.2024/123

Multi-modality image fusion aims to integrate images from multiple sensors, producing an image that is visually appealing and offers more comprehensive information than any single one. To ensure high visual quality and facilitate accurate subsequent perception tasks, previous methods have often cascaded networks using weighted loss functions. However, such simplistic strategies struggle to truly achieve the "Best of Both Worlds", and the adjustment of numerous hand-crafted parameters becomes burdensome. To address these challenges, this paper introduces a Compact, Automatic and Flexible framework, dubbed CAF, designed for infrared and visible image fusion, along with subsequent tasks. Concretely, we recast the combined problem of fusion and perception into a single objective, allowing mutual optimization of information from both tasks. Then we also utilize the perception task to inform the design of fusion loss functions, facilitating the automatic identification of optimal fusion objectives tailored to the task. Furthermore, CAF can support seamless integration with existing approaches easily, offering flexibility in adapting to various tasks and network structures. Extensive experiments demonstrate the superiority of CAF, which not only produces visually admirable fused results but also realizes 1.7 higher detection mAP@.5 and 2.0 higher segmentation mIoU than the state-of-the-art methods. The code is available at https://github.com/RollingPlain/CAF_IVIF.
Keywords:
Computer Vision: CV: Applications
Computer Vision: CV: Multimodal learning