Attention-Aware Anime Line Drawing Colorization

Yu Cao, Hao Tian, P.Y. Mok
DOI: https://doi.org/10.48550/arXiv.2212.10988
2023-02-24
Abstract: Automatic colorization of anime line drawing has attracted much attention in recent years since it can substantially benefit the animation industry. User-hint based methods are the mainstream approach for line drawing colorization, while reference-based methods offer a more intuitive approach. Nevertheless, although reference-based methods can improve feature aggregation of the reference image and the line drawing, the colorization results are not compelling in terms of color consistency or semantic correspondence. In this paper, we introduce an attention-based model for anime line drawing colorization, in which a channel-wise and spatial-wise Convolutional Attention module is used to improve the ability of the encoder for feature extraction and key area perception, and a Stop-Gradient Attention module with cross-attention and self-attention is used to tackle the cross-domain long-range dependency problem. Extensive experiments show that our method outperforms other SOTA methods, with more accurate line structure and semantic color information.
Computer Vision and Pattern Recognition, Artificial Intelligence, Graphics, Multimedia
What problem does this paper attempt to address?
This paper addresses automatic colorization of anime line drawings, with a focus on maintaining color consistency and semantic correspondence. Existing reference-based methods improve feature aggregation to some extent, but their results still fall short on both counts. The authors therefore propose an attention-based model built around two modules. The Convolutional Attention module applies channel-wise and spatial-wise attention to strengthen the encoder's extraction of multi-scale features and its perception of key regions. The Stop-Gradient Attention module combines cross-attention and self-attention to handle cross-domain long-range dependencies. According to the reported experiments, the method outperforms other state-of-the-art methods on anime line drawing colorization and produces more accurate line structure and semantic color information.
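
To make the two attention ideas above more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weights `w_channel` and `w_spatial`, and the tensor sizes are all hypothetical. The real modules use learned convolutions, which this sketch replaces with simple linear gates. The stop-gradient operation is not shown, since NumPy has no automatic differentiation, and the self-attention branch is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(feat, w_channel, w_spatial):
    """Gate a (C, H, W) feature map per channel, then per pixel.

    w_channel (C, C) and w_spatial (2,) are hypothetical stand-ins for
    learned layers; a real module would typically use convolutions here.
    """
    # Channel gate: global average pool -> linear map -> sigmoid.
    pooled = feat.mean(axis=(1, 2))                      # (C,)
    ch_gate = sigmoid(w_channel @ pooled)                # (C,)
    feat = feat * ch_gate[:, None, None]
    # Spatial gate: weighted mix of per-pixel mean and max over channels.
    stats = np.stack([feat.mean(axis=0), feat.max(axis=0)])    # (2, H, W)
    sp_gate = sigmoid(np.tensordot(w_spatial, stats, axes=1))  # (H, W)
    return feat * sp_gate[None]

def cross_attention(line_tokens, ref_tokens):
    """Scaled dot-product attention: queries from the line drawing,
    keys and values from the reference image. Shapes (N, D) and (M, D)."""
    d = line_tokens.shape[-1]
    scores = line_tokens @ ref_tokens.T / np.sqrt(d)     # (N, M)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ ref_tokens, weights

# Toy usage with random features (assumed sizes: 8 channels, 4x4 map).
rng = np.random.default_rng(0)
line_feat = rng.standard_normal((8, 4, 4))
ref_feat = rng.standard_normal((8, 4, 4))

gated = channel_spatial_attention(
    line_feat, rng.standard_normal((8, 8)), rng.standard_normal(2)
)
# Flatten each spatial position into one token of dimension C.
line_tokens = gated.reshape(8, -1).T      # (16, 8)
ref_tokens = ref_feat.reshape(8, -1).T    # (16, 8)
out, weights = cross_attention(line_tokens, ref_tokens)
```

Each row of `weights` describes how strongly one line-drawing position attends to every reference position, which is the mechanism that lets color information flow from semantically matching regions of the reference. In an autodiff framework, a stop-gradient would be applied to selected inputs of this attention step; which inputs the paper detaches is not stated in the abstract, so it is left out here.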