Abstract: Feed-forward CNNs trained for image transformation problems rely on loss functions that measure the similarity between the generated image and a target image. Most of the common loss functions assume that these images are spatially aligned and compare pixels at corresponding locations. However, for many tasks, aligned training pairs of images will not be available. We present an alternative loss function that does not require alignment, thus providing an effective and simple solution for a new space of problems. Our loss is based on both context and semantics: it compares regions with similar semantic meaning, while considering the context of the entire image. Hence, for example, when transferring the style of one face to another, it will translate eyes-to-eyes and mouth-to-mouth. Our code can be found at https://www.github.com/roimehrez/contextualLoss

What problem does this paper attempt to address?

This paper addresses how to train image transformation networks when the training data is not spatially aligned. Traditional methods usually rely on pixel-level loss functions to measure the similarity between the generated image and the target image. These losses assume that the two images are spatially aligned, so that pixels at the same position can be compared directly. In many practical applications, such as semantic style transfer, single-image animation, puppet control, and unpaired domain transfer, the training data is not aligned, so pixel-level loss functions cannot be applied directly.

To handle this case, the authors propose a new loss function, the Contextual Loss. It does not require the images to be spatially aligned, which makes it a simple and effective alternative. The Contextual Loss compares images based on context and semantics: it considers not only how similar individual features are but also the context of the entire image. As a result, image transformation remains effective even when the two images are spatially deformed relative to each other.

Specifically, the Contextual Loss is built in three steps:

1. **Feature Representation**: Represent each image as a set of high-dimensional points (features).
2. **Context Similarity**: Define a contextual similarity measure for comparing the features of two images. A feature in one image is considered contextually similar to a feature in the other image when it is markedly closer to that feature than to all other features in that image.
3. **Loss Function**: Define the loss from this contextual similarity and minimize it between the generated image and the target image, so that the generated image matches the target in content and semantics.

With this formulation, the Contextual Loss can handle non-aligned data, and the paper reports strong results on several image transformation tasks, including semantic style transfer, single-image animation, puppet control, and unpaired domain transfer.
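
The three steps above can be made concrete with a small sketch. It is a minimal illustration and not the authors' implementation (their code is at the repository linked in the abstract). It assumes cosine distance between features centered on the target mean, distances normalized by each feature's closest match, and an exponential kernel. The function names, the bandwidth `h`, and the constant `eps` are illustrative choices, and plain Python lists stand in for the CNN features a real pipeline would extract.

```python
import math


def _cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # The small constant guards against division by zero for zero vectors.
    return 1.0 - dot / (norm_a * norm_b + 1e-12)


def contextual_similarity(xs, ys, h=0.5, eps=1e-5):
    """Similarity between two unordered feature sets (hypothetical helper).

    xs: features of the generated image, ys: features of the target image,
    each a list of equal-length vectors. h and eps are assumed defaults.
    """
    # Step 1: treat each image as a set of points, centered on the target mean.
    dim = len(ys[0])
    mu = [sum(y[k] for y in ys) / len(ys) for k in range(dim)]
    xs_c = [[v - m for v, m in zip(x, mu)] for x in xs]
    ys_c = [[v - m for v, m in zip(y, mu)] for y in ys]

    # Step 2: pairwise distances, normalized by each x_i's closest match, so
    # that similarity is high only when one y_j is much closer than the rest.
    cx = []
    for x in xs_c:
        row = [_cosine_distance(x, y) for y in ys_c]
        d_min = min(row)
        weights = [math.exp((1.0 - d / (d_min + eps)) / h) for d in row]
        total = sum(weights)
        cx.append([w / total for w in weights])

    # For every target feature keep its best-matching generated feature,
    # then average over the target features.
    n_y = len(ys)
    return sum(max(cx_row[j] for cx_row in cx) for j in range(n_y)) / n_y


def contextual_loss(xs, ys, h=0.5, eps=1e-5):
    """Step 3: negative log of the contextual similarity."""
    return -math.log(contextual_similarity(xs, ys, h, eps))


if __name__ == "__main__":
    target = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    shuffled = list(reversed(target))   # same features, different positions
    collapsed = [[1.0, 0.0, 0.0]] * 3   # covers only one target feature
    print(contextual_loss(shuffled, target))
    print(contextual_loss(collapsed, target))
```

Because the features are treated as a set, reordering them does not change the loss, which is the property that removes the alignment requirement. A generated image that reproduces only one of the target's features is penalized, because the remaining target features find no good match.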

The Contextual Loss for Image Transformation with Non-Aligned Data
