Image Style Transfer Algorithm Based on Semantic Segmentation

Chuan Xie, Zhizhong Wang, Haibo Chen, Xiaolong Ma, Wei Xing, Lei Zhao, Wei Song, Zhijie Lin
DOI: https://doi.org/10.1109/access.2021.3054969
IF: 3.9
2021-01-01
IEEE Access
Abstract: Most existing image style transfer algorithms transfer the style of an image as a whole. The style feature is a set of correlation matrices computed from the style image, namely Gram matrices, and each matrix is a global description of the style image. Such methods perform well in semantically insensitive scenes (such as style transfer between landscape photos), but in semantically sensitive scenes (such as style transfer between portrait photos) semantic mismatch becomes prominent, for example when the background texture of the style image is transferred to the foreground of the target image. Existing research takes a manually annotated semantic image as an additional input, guides the style transfer with this semantic information, and achieves good results for style transfer between portraits. However, two problems remain. First, semantic images need to be manually annotated, which costs human resources, whereas practical applications often require large-scale image style transfer. Second, the details of the synthesized image are blurry and insufficiently sharp. We propose an image style transfer algorithm based on semantic segmentation to resolve semantic mismatch in image style transfer. Our algorithm automatically extracts the semantic information of the style image and the content image through a semantic segmentation network and uses it to guide the style transfer. Specifically, it builds a semantic segmentation network based on Mask R-CNN, introduces the semantic information, and then performs style transfer at the patch level, so that style is transferred between similar objects (objects with consistent semantic information). Experiments on CelebA and WikiArt show that our method can automatically extract the semantic information of the style image and the content image. Compared with state-of-the-art approaches in this field, our method effectively avoids semantic mismatch during image style transfer; that is, it maintains semantic consistency throughout the transfer.
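To illustrate why a Gram matrix is a global description, the following minimal sketch computes one from a toy feature map and shows that it is unchanged when the spatial positions are shuffled. The function names, the toy activations, and the normalization by the number of positions are assumptions made for illustration; they are not taken from the paper.

```python
def gram_matrix(features):
    """Return the C x C Gram matrix of a feature map.

    `features` is a list of C channels, each a flat list of N activations
    (a C x H x W feature map flattened to C x N). Dividing by N is one
    common normalization; conventions vary between implementations.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n for ch_j in features]
        for ch_i in features
    ]


def shuffled(features, order):
    """Reorder spatial positions identically in every channel."""
    return [[channel[k] for k in order] for channel in features]


# Two channels, four spatial positions (hypothetical toy activations).
feats = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 1.0]]

g = gram_matrix(feats)
g_shuffled = gram_matrix(shuffled(feats, [3, 0, 2, 1]))

print(g)                # [[1.5, 0.75], [0.75, 2.75]]
print(g == g_shuffled)  # True: spatial layout does not affect the result
```

Because the matrix ignores where each activation occurs, it cannot distinguish foreground from background, which is consistent with the background-to-foreground mismatch described in the abstract.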
computer science, information systems; telecommunications; engineering, electrical & electronic
What problem does this paper attempt to address?
This paper addresses the semantic mismatch problem in image style transfer. Most existing image style transfer algorithms transfer the style of an image as a whole. These methods perform well in scenarios that are not semantically sensitive (such as style transfer between landscape photos), but in semantically sensitive scenarios (such as style transfer between portrait photos) semantic mismatch occurs; for example, the background texture of the style image is transferred to the foreground of the target image. Some existing studies have achieved good results in style transfer between portraits by taking manually annotated semantic images as algorithm inputs and guiding the style transfer with this semantic information, but this approach has two main problems: first, the semantic images must be annotated by hand, which consumes human resources; second, the details of the synthesized image are blurred and lack clarity. To address this, the authors propose an image style transfer algorithm based on semantic segmentation (referred to as the SST algorithm for short). The algorithm builds a semantic segmentation network based on Mask R-CNN to automatically extract the semantic information of the style image and the content image, and uses this information to guide style transfer, so that style is transferred between similar objects (objects with the same semantic information). Experimental results show that, compared with state-of-the-art methods in the field, this method effectively avoids semantic mismatch during image style transfer; that is, it maintains semantic consistency throughout the transfer.
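The idea of restricting patch-level transfer to regions with the same semantic label can be sketched as follows. This is a simplified illustration and not the authors' implementation: the labels are supplied by hand instead of coming from a Mask R-CNN based network, each patch is reduced to a short feature vector, and the function names, the squared-distance metric, and the fallback rule are all assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def match_patches(content, style, semantic=True):
    """For each content patch, return the index of the closest style patch.

    `content` and `style` are lists of (label, feature_vector) pairs.
    With semantic=True, only style patches carrying the same label are
    candidates; if no style patch shares the label, all style patches
    are considered (an assumed fallback for this sketch).
    """
    matches = []
    for c_label, c_vec in content:
        candidates = []
        if semantic:
            candidates = [i for i, (s_label, _) in enumerate(style)
                          if s_label == c_label]
        if not candidates:
            candidates = list(range(len(style)))
        best = min(candidates,
                   key=lambda i: squared_distance(c_vec, style[i][1]))
        matches.append(best)
    return matches


# Hypothetical toy patches: (semantic label, feature vector).
style = [("background", [0.9, 0.1]),
         ("face", [0.2, 0.8]),
         ("face", [0.5, 0.5])]
content = [("face", [0.8, 0.2]),
           ("background", [0.1, 0.9])]

unconstrained = match_patches(content, style, semantic=False)
constrained = match_patches(content, style)

print(unconstrained)  # [0, 1]: the face patch picks a background patch
print(constrained)    # [2, 0]: every patch stays within its own label
```

In the unconstrained case the content face patch is matched to a style background patch purely because their features are close, which is the kind of mismatch the summary describes; constraining candidates by label prevents it at the cost of a possibly less similar match.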