Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration
Di Wang, Jinyuan Liu, Xin Fan, Risheng Liu
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 3508-3515.
https://doi.org/10.24963/ijcai.2022/487
Recent learning-based image fusion methods have made remarkable progress on pre-registered multi-modality data, but they suffer from serious ghosting artifacts when handling misaligned multi-modality data, owing to spatial deformation and the difficulty of narrowing the cross-modality discrepancy.
To overcome these obstacles, in this paper we present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion (IVIF).
Specifically, we propose a Cross-modality Perceptual Style Transfer Network (CPSTN) that generates a pseudo infrared image from a visible image.
Benefiting from the favorable geometry-preserving ability of the CPSTN, the generated pseudo infrared image exhibits sharp structures; together with the structure sensitivity of infrared imagery, this makes it well suited to converting cross-modality image alignment into a mono-modality registration problem.
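The abstract gives no implementation details for the CPSTN. As a minimal sketch of how structure preservation can be encouraged in such a generator, a perceptual loss over frozen VGG-16 features is one common choice; the layer cut-off and L1 loss form below are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

class PerceptualLoss(nn.Module):
    """Feature-space L1 distance over frozen VGG-16 activations (illustrative)."""

    def __init__(self, layer_index: int = 16):  # up to relu3_3 -- an assumed cut-off
        super().__init__()
        self.features = vgg16(weights="IMAGENET1K_V1").features[:layer_index].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)

    def forward(self, generated: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        # Replicate single-channel (pseudo) infrared images to 3 channels for VGG.
        if generated.size(1) == 1:
            generated = generated.repeat(1, 3, 1, 1)
            reference = reference.repeat(1, 3, 1, 1)
        return F.l1_loss(self.features(generated), self.features(reference))
```

Penalizing distance in a deep feature space, rather than pixel space, is what lets a style-transfer generator change modality appearance while keeping scene geometry intact.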
Under this mono-modality setting, we introduce a Multi-level Refinement Registration Network (MRRN) to predict the displacement vector field between the distorted and pseudo infrared images and to reconstruct the registered infrared image.
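A predicted displacement vector field is typically applied by resampling the distorted image along the deformed coordinate grid. The following PyTorch sketch, assuming a dense per-pixel (dx, dy) field in pixel units, illustrates this warping step; the function name and conventions are illustrative rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def warp_image(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `image` (N, C, H, W) by a dense displacement field `flow` (N, 2, H, W).

    `flow` holds per-pixel (dx, dy) offsets in pixel units; they are added to an
    identity sampling grid and normalized to [-1, 1] for grid_sample.
    """
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=image.dtype, device=image.device),
        torch.arange(w, dtype=image.dtype, device=image.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0).expand(n, -1, -1, -1)
    coords = grid + flow
    # Normalize absolute pixel coordinates to the [-1, 1] range grid_sample expects.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(image, sample_grid, mode="bilinear",
                         padding_mode="border", align_corners=True)
```

Because bilinear sampling is differentiable, such a warp lets registration be trained end to end from an image similarity loss between the warped and pseudo infrared images.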
Moreover, to better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM) that adaptively selects more meaningful features for fusion within the Dual-path Interaction Fusion Network (DIFN).
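The abstract does not specify how the IFM selects features; one plausible form of adaptive selection is a learned, channel- and pixel-wise soft gating between the two branches, sketched below under that assumption (the module name and architecture here are hypothetical, not the authors').

```python
import torch
import torch.nn as nn

class InteractionFusion(nn.Module):
    """Hypothetical adaptive feature selection between infrared and visible branches."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict per-channel, per-pixel weights from the concatenated features.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2 * channels, kernel_size=1),
        )

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        n, c, h, w = feat_ir.shape
        logits = self.gate(torch.cat((feat_ir, feat_vis), dim=1)).view(n, 2, c, h, w)
        weights = torch.softmax(logits, dim=1)  # competition between the two paths
        return weights[:, 0] * feat_ir + weights[:, 1] * feat_vis
```

The softmax over the two paths forces a convex combination per feature, which is one simple way to realize "adaptively select more meaningful features" for fusion.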
Extensive experimental results demonstrate that the proposed method achieves superior performance on misaligned cross-modality image fusion.
Keywords:
Machine Learning: Multi-modal learning
Machine Learning: Unsupervised Learning