Conti-Fuse: A Novel Continuous Decomposition-based Fusion Framework for Infrared and Visible Images

Hui Li, Haolong Ma, Chunyang Cheng, Zhongwei Shen, Xiaoning Song, Xiao-Jun Wu
2024-11-27
Abstract: To better explore inter-modal and intra-modal relations, the concept of decomposition plays a crucial role, even in deep learning fusion frameworks. However, previous decomposition strategies (base & detail or low-frequency & high-frequency) are too coarse to represent the common and unique features of the source modalities, which degrades the quality of the fused images. Existing strategies treat these relations as a binary system, which may not suit complex generation tasks such as image fusion. To address this issue, a continuous decomposition-based fusion framework (Conti-Fuse) is proposed. Conti-Fuse treats the decomposition results as a few samples along the feature variation trajectory of the source images and extends this concept to a more general state to achieve continuous decomposition. This continuous decomposition strategy enhances the representation of inter-modal complementary information by increasing the number of decomposition samples, thus reducing the loss of critical information. To facilitate this process, the continuous decomposition module (CDM) is introduced to decompose the input into a series of continuous components. The core module of CDM, the State Transformer (ST), is used to efficiently capture complementary information from the source modalities. Furthermore, a novel decomposition loss function is designed that ensures the smooth progression of the decomposition process while keeping time complexity linear in the number of decomposition samples. Extensive experiments demonstrate that the proposed Conti-Fuse achieves superior performance compared to state-of-the-art fusion methods.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses a weakness in infrared and visible-light image fusion: existing decomposition strategies are too coarse to fully represent the common and unique features of the source modalities, which degrades the quality of the fused image. Specifically, traditional decomposition methods (such as base-and-detail or low-and-high-frequency decomposition) treat cross-modal and intra-modal relationships as a binary system, which may not be suitable for complex generation tasks such as image fusion. To solve this problem, the paper proposes a fusion framework based on continuous decomposition (Conti-Fuse), which enhances the representation of cross-modal complementary information by increasing the number of decomposition samples, thereby reducing the loss of key information.

### Paper Background

- **Image Fusion**: Image fusion aims to create an information-rich and visually appealing image by extracting the most significant information from different source images.
- **Infrared and Visible-light Image Fusion (IVIF)**: This task requires integrating complementary information from different modalities to overcome the limitations of single-modality images. Visible-light images are rich in texture information but are easily affected by factors such as illumination changes and occlusions; infrared images perform well under extreme conditions (such as low light) but are easily affected by noise.

### Limitations of Existing Methods

- **Shallow Feature-space Decomposition Method (SFID)**: It relies on manually designed decomposition operations, lacks adaptability to source images, and generalizes poorly.
- **Deep Feature-space Decomposition Method (DFID)**: Although it uses deep neural networks for deeper feature decomposition, it usually decomposes the original image coarsely into a few non-overlapping features, losing some key information from the source image.

### Proposed Method

- **Conti-Fuse Framework**:
  - **Continuous Decomposition Module (CDM)**: It decomposes the input into a series of continuous components and efficiently captures the complementary information of the source modalities through the State Transformer.
  - **Continuous Decomposition Loss Function**: It ensures the smooth progression of the decomposition process while keeping the time complexity linear in the number of decomposition samples.

### Main Contributions

1. **Novel Decomposition Strategy**: Dense sampling along the deep-feature variation trajectories of the two modalities yields rich decomposition features, reducing the loss of key information in the fused image.
2. **Efficient Decomposition Loss Function**: It uses the Monte Carlo method to accelerate the calculation, reducing the time complexity of the continuous decomposition loss from quadratic to linear and improving the scalability of the model.
3. **Extensive Experimental Verification**: Qualitative and quantitative experiments show that the method outperforms other state-of-the-art fusion methods.

### Conclusion

The Conti-Fuse framework addresses the deficiencies of existing methods in infrared and visible-light image fusion by introducing a continuous decomposition strategy and an efficient loss function, improving the quality of the fused image.
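
The summary says a Monte Carlo method brings the cost of the decomposition loss down from quadratic to linear in the number of decomposition samples. The following minimal sketch illustrates only that general idea and is not the paper's loss. It assumes the loss is a mean over pairs of components. The names `pair_term`, `full_pairwise_loss`, and `sampled_pairwise_loss` are hypothetical, and the squared-distance penalty is a stand-in chosen for illustration.

```python
import random
from itertools import combinations


def pair_term(a, b):
    """Hypothetical pairwise penalty (an assumption, not the paper's):
    squared Euclidean distance between two component vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def full_pairwise_loss(components):
    """Exact mean of pair_term over all N*(N-1)/2 unordered pairs.
    The number of terms grows quadratically with N."""
    pairs = list(combinations(range(len(components)), 2))
    return sum(pair_term(components[i], components[j]) for i, j in pairs) / len(pairs)


def sampled_pairwise_loss(components, num_samples=None, rng=random):
    """Monte Carlo estimate of the same mean from randomly drawn pairs.
    With num_samples defaulting to N, the number of terms grows linearly with N."""
    n = len(components)
    if n < 2:
        raise ValueError("need at least two components")
    if num_samples is None:
        num_samples = n
    total = 0.0
    for _ in range(num_samples):
        # Two distinct indices; pair_term is symmetric, so this is
        # a uniform draw over unordered pairs.
        i, j = rng.sample(range(n), 2)
        total += pair_term(components[i], components[j])
    return total / num_samples


if __name__ == "__main__":
    # Eight one-dimensional toy "components" standing in for decomposition samples.
    comps = [[float(k)] for k in range(8)]
    print("exact   :", full_pairwise_loss(comps))
    print("sampled :", sampled_pairwise_loss(comps, rng=random.Random(0)))
```

Because pairs are drawn uniformly, the sampled value is an unbiased estimate of the exact mean. Drawing more pairs lowers its variance, so the sample count trades accuracy against cost.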