Abstract: Image fusion is well known as an alternative to single-image restoration: rather than restoring one degraded image, it generates one high-quality image from multiple images. The essence of image fusion is to integrate complementary information from the source images. Existing fusion methods generalize poorly across tasks and often require labor-intensive designs, because the diverse requirements of each fusion task make it difficult to identify and extract useful information from the source images. Additionally, these methods develop highly specialized features for particular downstream applications, which hinders adaptation to new and diverse downstream tasks. To address these limitations, we introduce DeFusion++, a novel framework that leverages self-supervised learning (SSL) to enhance the versatility of feature representations for different image fusion tasks. DeFusion++ learns fusion-friendly representations from large-scale data in a self-supervised way, overcoming the constraints of limited fusion datasets. Specifically, we introduce two pretext tasks: common and unique decomposition (CUD) and masked feature modeling (MFM). CUD decomposes source images into abstract common and unique components, while MFM refines these components into robust fused features. Joint training on these tasks enables DeFusion++ to produce adaptable representations that effectively extract useful information from various source images, regardless of the fusion task. The resulting fused representations are also highly adaptable to a wide range of downstream tasks, including image segmentation and object detection. DeFusion++ stands out by producing versatile fused representations that enhance both the quality of image fusion and the effectiveness of downstream high-level vision tasks within a single, simple fusion framework.
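To make the idea of a common and unique decomposition concrete, the toy sketch below builds self-supervised targets from a single image. It rests on an assumption the abstract does not confirm: two "source" views are simulated by hiding random pixels of one clean image, so the common and unique parts are known by construction. The function names (`make_masked_views`, `decomposition_targets`, `recombine`) and the `keep_prob` parameter are hypothetical, and no network is trained here; the code only shows what a model could be asked to predict.

```python
import random


def make_masked_views(image, keep_prob=0.6, seed=0):
    """Simulate two source views by randomly hiding pixels of one image.

    `image` is a flat list of pixel values. The masking scheme is an
    illustrative assumption, not the procedure described in the abstract.
    """
    rng = random.Random(seed)
    mask1 = [1 if rng.random() < keep_prob else 0 for _ in image]
    mask2 = [1 if rng.random() < keep_prob else 0 for _ in image]
    view1 = [p * m for p, m in zip(image, mask1)]
    view2 = [p * m for p, m in zip(image, mask2)]
    return (view1, mask1), (view2, mask2)


def decomposition_targets(image, mask1, mask2):
    """Return (common, unique1, unique2) targets implied by the two masks."""
    # Common: pixels visible in both views.
    common = [p * (a & b) for p, a, b in zip(image, mask1, mask2)]
    # Unique: pixels visible in exactly one of the views.
    unique1 = [p * (a & (1 - b)) for p, a, b in zip(image, mask1, mask2)]
    unique2 = [p * (b & (1 - a)) for p, a, b in zip(image, mask1, mask2)]
    return common, unique1, unique2


def recombine(common, unique1, unique2):
    """Fuse the components: every pixel seen by at least one view is kept."""
    return [c + u1 + u2 for c, u1, u2 in zip(common, unique1, unique2)]


if __name__ == "__main__":
    image = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4]
    (view1, mask1), (view2, mask2) = make_masked_views(image)
    common, unique1, unique2 = decomposition_targets(image, mask1, mask2)
    fused = recombine(common, unique1, unique2)
    print("view 1:", view1)
    print("view 2:", view2)
    print("fused :", fused)
```

In this toy setting each view equals its unique part plus the common part, and the fused result covers every pixel that either view observed, which mirrors the stated goal of integrating complementary information from the source images.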

Convolution–deconvolution Word Embedding: an End-to-end Multi-Prototype Fusion Embedding Method for Natural Language Processing

Enhanced Double-Carrier Word Embedding Via Phonetics and Writing

Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models

Do Multi-Sense Embeddings Improve Natural Language Understanding?

Improve Word Embedding Using Both Writing and Pronunciation

Fast Extraction of Word Embedding from Q-contexts

Stacked Convolutional Deep Encoding Network for Video-Text Retrieval

Joint Learning of Character and Word Embeddings

Pre-Trained Multi-View Word Embedding Using Two-Side Neural Network

A Probabilistic Model for Learning Multi-Prototype Word Embeddings

Dual-path Convolutional Image-Text Embeddings with Instance Loss

Learning Context-Specific Word/Character Embeddings

VCWE: Visual Character-Enhanced Word Embeddings

Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Combination Methods of Chinese Character and Word Embeddings in Deep Learning

Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems

Context-Specific and Multi-Prototype Character Representations

Knowledge-Powered Deep Learning for Word Embedding

Dual-perspective fusion for word translation enhancement

Task-Specific Dependency-based Word Embedding Methods