Automatic Controllable Colorization via Imagination

Xiaoyan Cong, Yue Wu, Qifeng Chen, Chenyang Lei
2024-04-09
Abstract: We propose a framework for automatic colorization that allows for iterative editing and modifications. The core of our framework lies in an imagination module: by understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content. These images serve as references for coloring, mimicking the process of human experts. As the synthesized images can be imperfect or different from the original grayscale image, we propose a Reference Refinement Module to select the optimal reference composition. Unlike most previous end-to-end automatic colorization algorithms, our framework allows for iterative and localized modifications of the colorization results because we explicitly model the coloring samples. Extensive experiments demonstrate the superiority of our framework over existing automatic colorization algorithms in editability and flexibility. Project page:
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper proposes an automatic colorization framework that allows for iterative editing and modification. Its core is an imagination module, which interprets the content of a grayscale image and uses a pre-trained image generation model to generate multiple color reference images with the same content, mimicking how human experts color an image. Since the synthesized images may be imperfect or differ from the original grayscale image, a Reference Refinement Module is proposed to select the best combination of references. Unlike most end-to-end automatic colorization algorithms, this framework supports iterative and localized modifications of the results, because the coloring samples are modeled explicitly. The experiments show that the framework outperforms existing automatic colorization algorithms in editability and flexibility.

The main contributions of the paper are:
1. A novel automatic colorization framework that utilizes a pre-trained diffusion model and introduces an imagination module to generate colorful reference images that are semantically similar, structurally aligned, and instance-aware, imitating human experts.
2. A demonstration that the framework offers significant controllability and user interaction, and can produce diverse colorization results.
3. State-of-the-art performance and generalization ability compared with previous automatic colorization methods.

The paper compares existing automatic colorization methods and points out the challenges they face: color ambiguity, gray or undersaturated output, and results that are difficult to modify. The imagination module and the Reference Refinement Module are introduced to address these problems and to improve the naturalness, realism, and diversity of the colorization.
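The idea of selecting a reference composition from several imperfect candidates can be illustrated with a toy sketch. This is not the paper's Reference Refinement Module, whose details are not given in this summary. It assumes, purely for illustration, that the image is already split into regions and that a candidate is "good" for a region when its luminance agrees with the grayscale input there. All names (`luminance`, `region_error`, `compose_reference`) are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]        # RGB values in 0..255
ColorImage = List[List[Pixel]]
GrayImage = List[List[int]]
Region = List[Tuple[int, int]]      # (row, col) coordinates of one region


def luminance(p: Pixel) -> float:
    """Rec. 601 luma approximation of an RGB pixel."""
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def region_error(gray: GrayImage, candidate: ColorImage, region: Region) -> float:
    """Mean absolute luminance difference inside one region (assumed score)."""
    total = sum(abs(gray[y][x] - luminance(candidate[y][x])) for y, x in region)
    return total / len(region)


def compose_reference(
    gray: GrayImage, candidates: List[ColorImage], regions: List[Region]
) -> Tuple[ColorImage, List[int]]:
    """Per region, copy pixels from the candidate with the lowest error."""
    # Pixels not covered by any region fall back to their gray value.
    composed: ColorImage = [[(g, g, g) for g in row] for row in gray]
    chosen: List[int] = []
    for region in regions:
        errors = [region_error(gray, c, region) for c in candidates]
        best = min(range(len(candidates)), key=errors.__getitem__)
        chosen.append(best)
        for y, x in region:
            composed[y][x] = candidates[best][y][x]
    return composed, chosen


# Tiny 1x2 example: each candidate matches the input in only one region.
gray = [[100, 200]]
candidate_a = [[(200, 60, 60), (10, 10, 10)]]
candidate_b = [[(0, 0, 0), (180, 220, 150)]]
regions = [[(0, 0)], [(0, 1)]]
composed, chosen = compose_reference(gray, [candidate_a, candidate_b], regions)
print(chosen)
print(composed)
```

Because the composed reference records which candidate supplied each region, a single region can be swapped to another candidate without touching the rest. This is one way to read the localized, iterative editing the summary describes, though the actual mechanism in the paper may differ.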