Abstract: Neural style transfer is an optimization technique that combines two images, a content image and a style reference image, to produce an output image that retains the appearance of the content image but is rendered in the style of the style reference image. This is achieved by optimizing the output image to match the content statistics of the content image and the style statistics of the style reference image. These statistics are extracted from the images using a convolutional network. Early models such as WCT were improved upon by models such as PhotoWCT, whose spatial and temporal limitations were in turn addressed by Deep Photo Style Transfer. Eventually, wavelet transforms were introduced to perform photorealistic style transfer. A wavelet-corrected transfer based on whitening and colouring transforms, i.e., WCT2, was proposed that preserves core content and eliminates the need for post-processing steps and constraints. Domain-Aware Universal Style Transfer was also proposed; it supports both artistic and photorealistic style transfer. This study provides an overview of the neural style transfer technique. Recent advancements in the field, including the development of multi-scale and adaptive methods and the integration of semantic segmentation, are discussed. Experiments were conducted to determine the roles of the encoder-decoder architecture and Haar wavelet functions, and to ascertain the levels at which these can best be leveraged for effective style transfer. The study also contrasts the VGG-16 and VGG-19 architectures and analyzes various performance parameters to establish which works more efficiently for particular use cases.
A comparison of quantitative metrics across Gatys, AdaIN, and WCT showed a gradual improvement from model to model: AdaIN performed 99.92 percent better than the original Gatys model in terms of processing time. Over 1000 iterations, VGG-16 and VGG-19 showed comparable style loss, but their content loss differed by 73.1 percent. VGG-19 nevertheless displayed better overall performance, since it keeps both content and style losses low.
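To make the content and style statistics mentioned above concrete, the following is a minimal sketch in plain Python, not the implementation used in this study. It assumes feature maps are given as a list of channels, each a flat list of activations. In practice these would come from VGG-16 or VGG-19 layers. The function names and the default loss weights are illustrative assumptions. Style is summarized by Gram matrices, as in the Gatys formulation, and `adain` shows the per-channel mean and standard deviation matching that AdaIN uses.

```python
from statistics import fmean, pstdev


def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    `features` is a list of C channels, each a flat list of N activations.
    Returns a C x C nested list, normalized by N.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
            for ci in features]


def content_loss(output, content):
    """Mean squared error between output and content activations."""
    return fmean((o - c) ** 2
                 for oc, cc in zip(output, content)
                 for o, c in zip(oc, cc))


def style_loss(output, style):
    """Mean squared error between the two Gram matrices."""
    g_out, g_sty = gram_matrix(output), gram_matrix(style)
    return fmean((a - b) ** 2
                 for row_a, row_b in zip(g_out, g_sty)
                 for a, b in zip(row_a, row_b))


def total_loss(output, content, style, content_weight=1.0, style_weight=100.0):
    """Weighted objective; the weights here are illustrative assumptions."""
    return (content_weight * content_loss(output, content)
            + style_weight * style_loss(output, style))


def adain(content, style, eps=1e-5):
    """Shift and scale each content channel to the style channel's mean/std."""
    result = []
    for cc, sc in zip(content, style):
        c_mean, c_std = fmean(cc), pstdev(cc) + eps
        s_mean, s_std = fmean(sc), pstdev(sc)
        result.append([s_std * (x - c_mean) / c_std + s_mean for x in cc])
    return result
```

In an optimization-based method, `total_loss` would be minimized with respect to the output image over many iterations, whereas a feed-forward method such as AdaIN applies the statistic matching once in feature space and decodes the result. That difference is consistent with the processing-time gap reported above.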
