Scaling Painting Style Transfer

Bruno Galerne, Lara Raad, José Lezama, Jean-Michel Morel
2024-06-26
Abstract: Neural style transfer (NST) is a deep learning technique that produces an unprecedentedly rich style transfer from a style image to a content image. It is particularly impressive when it comes to transferring style from a painting to an image. NST was originally achieved by solving an optimization problem to match the global statistics of the style image while preserving the local geometric features of the content image. The two main drawbacks of this original approach are that it is computationally expensive and that the resolution of the output images is limited by high GPU memory requirements. Many solutions have been proposed to both accelerate NST and produce larger images. However, our investigation shows that these accelerated methods all compromise the quality of the produced images in the context of painting style transfer. Indeed, transferring the style of a painting is a complex task involving features at different scales, from the color palette and compositional style to the fine brushstrokes and texture of the canvas. This paper provides a way to solve the original global optimization for ultra-high resolution (UHR) images, enabling multiscale NST at unprecedented image sizes. This is achieved by spatially localizing the computation of each forward and backward pass through the VGG network. Extensive qualitative and quantitative comparisons, as well as a perceptual study, show that our method produces style transfer of unmatched quality for such high-resolution painting styles. By a careful comparison, we show that state-of-the-art fast methods are still prone to artifacts, thus suggesting that fast painting style transfer remains an open problem. Source code is available at https://github.com/bgalerne/scaling_painting_style_transfer.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses the problems that Neural Style Transfer (NST) faces with Ultra-High Resolution (UHR) images. The original NST method matches the global statistics of the style image while preserving the local geometric features of the content image, but it is computationally expensive and its GPU memory requirements limit the resolution of the output image. Existing accelerated methods reduce this cost, but they sacrifice image quality when transferring the style of a painting. The paper proposes an algorithm that solves the original global optimization problem for UHR images and enables multi-scale NST at unprecedented image sizes. By spatially localizing the forward and backward passes through the VGG network, the method preserves style features at every scale, including the color palette, compositional style, fine brushstrokes, and canvas texture. Experiments and a perceptual study show that the method produces style transfer of unmatched quality for high-resolution painting styles, while existing fast methods still suffer from mismatched artistic effects and a lack of fine detail. The main contributions of the paper are:
1. A two-step algorithm that uses local neural features to compute style transfer loss gradients for UHR images too large to fit in GPU memory.
2. A demonstration that this algorithm achieves multi-scale UHR style transfer for images of up to 20k² pixels, conveying a natural painting aesthetic at each scale.
3. Comparative experiments showing that the proposed UHR style transfer outperforms existing fast approximate solutions in visual quality and fidelity to the style.
Although this method is computationally intensive and takes several minutes to generate an image, it provides a new gold standard for practitioners pursuing the highest image quality, and its open implementation will facilitate future research on fast but approximate models.
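The idea behind the two-step gradient computation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the style statistic is a Gram matrix of a precomputed feature array of shape (channels, pixels), and the function names `tiled_gram` and `tiled_style_grad` are hypothetical. The real method also has to propagate each tile's gradient back through the VGG layers to the image, which requires handling tile borders and is omitted here. The sketch only shows why a global statistic can be accumulated tile by tile (step 1) and why, once it is known, the loss gradient for each tile depends only on that tile's features (step 2).

```python
import numpy as np


def tiled_gram(feats, tile):
    """Step 1: accumulate the global Gram matrix F F^T / n from column tiles."""
    channels, n = feats.shape
    gram = np.zeros((channels, channels))
    for start in range(0, n, tile):
        block = feats[:, start:start + tile]  # only one tile is needed at a time
        gram += block @ block.T
    return gram / n


def tiled_style_grad(feats, target_gram, tile):
    """Step 2: gradient of ||G(F) - G_target||_F^2 with respect to F, per tile."""
    channels, n = feats.shape
    # The global statistic is small (channels x channels) and is shared by all tiles.
    diff = tiled_gram(feats, tile) - target_gram
    grad = np.empty_like(feats)
    for start in range(0, n, tile):
        block = feats[:, start:start + tile]
        # For a symmetric target, dL/dF = (4 / n) * (G - G_target) @ F.
        grad[:, start:start + tile] = (4.0 / n) * diff @ block
    return grad


# Toy data standing in for feature maps (assumption: random values, 8 channels).
rng = np.random.default_rng(0)
content_feats = rng.standard_normal((8, 1000))
style_feats = rng.standard_normal((8, 600))

style_gram = tiled_gram(style_feats, tile=128)
grad = tiled_style_grad(content_feats, style_gram, tile=128)

# Reference gradient computed in one piece, for comparison with the tiled version.
full_gram = content_feats @ content_feats.T / content_feats.shape[1]
full_grad = (4.0 / content_feats.shape[1]) * (full_gram - style_gram) @ content_feats
print(np.allclose(grad, full_grad))
```

In this toy setting the tiled result is compared against the gradient computed on the whole array at once, so peak memory is traded for an extra pass over the tiles rather than for accuracy.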