Bridging the Gap: Sketch to Color Diffusion Model with Semantic Prompt Learning.

Ning Wang, Yifei She, Rui Xu, Bin Liu, Haojie Li, Zhiyong Wang, Zhihui Wang
DOI: https://doi.org/10.1109/ICASSP48485.2024.10448330
2024-01-01
Abstract: Automatic anime sketch colorization aims to generate a color image from a sketch image. The task is challenging because of limited structural and semantic understanding, which leads to constrained style and semantic color inconsistency. In this paper, we introduce a sketch to color diffusion model with semantic prompt learning (SPL), which learns better semantic prompts to stimulate the powerful structure and semantic understanding capabilities of large-scale multi-modal diffusion models, effectively bridging the gap between sketch and color. We introduce two distillation strategies for learning semantic prompts. The first is prediction-level distillation, which optimizes a global knowledge distillation loss and a local activation knowledge distillation loss. The second is feature-level distillation, which optimizes a hierarchy-wise feature distillation loss to transfer knowledge to the output features of different hierarchies in the model. Experimental results show that the proposed distillation strategies generate high-quality semantic prompts, yielding images with a superior visual effect compared to current automatic anime sketch colorization methods.
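The abstract names the two distillation strategies but gives no formulas, so the following is only a minimal sketch of what such losses commonly look like, not the authors' implementation. It assumes a temperature-softened KL divergence for the global prediction-level term and a weighted sum of per-level mean squared errors for the hierarchy-wise feature term. The function names, the temperature, and the per-level weights are all hypothetical, and the local activation loss is omitted because the abstract does not describe it.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def global_kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Assumed form: KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hierarchy_feature_loss(teacher_feats, student_feats, weights=None):
    """Assumed form: weighted sum of per-hierarchy MSE terms.

    teacher_feats and student_feats are lists with one flattened
    feature vector per hierarchy level.
    """
    if weights is None:
        weights = [1.0] * len(teacher_feats)
    return sum(
        w * mse(t, s)
        for w, t, s in zip(weights, teacher_feats, student_feats)
    )


if __name__ == "__main__":
    # Toy values only; they are not taken from the paper.
    print(global_kd_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8]))
    print(hierarchy_feature_loss([[1.0, 2.0], [0.0]], [[1.0, 4.0], [1.0]]))
```

In a real setup the two terms would be computed on tensors and combined with the diffusion training objective; how the paper weights them is not stated in the abstract.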