Abstract: We propose a framework for automatic colorization that allows for iterative editing and modifications. The core of our framework lies in an imagination module: by understanding the content within a grayscale image, we utilize a pre-trained image generation model to generate multiple images that contain the same content. These images serve as references for coloring, mimicking the process of human experts. As the synthesized images can be imperfect or different from the original grayscale image, we propose a Reference Refinement Module to select the optimal reference composition. Unlike most previous end-to-end automatic colorization algorithms, our framework allows for iterative and localized modifications of the colorization results because we explicitly model the coloring samples. Extensive experiments demonstrate the superiority of our framework over existing automatic colorization algorithms in editability and flexibility. Project page:

What problem does this paper attempt to address?

This paper proposes an automatic colorization framework that allows for iterative editing and modification. Its core is an imagination module, which interprets the content of a grayscale image and uses a pre-trained image generation model to produce multiple color reference images with the same content, simulating how human experts approach coloring. Because the synthesized images may be imperfect or may differ from the original grayscale image, a reference refinement module selects the best combination of references. Unlike most end-to-end automatic colorization algorithms, this framework supports iterative and localized modification of the results, because it models the coloring samples explicitly. The experiments show that the framework outperforms existing automatic colorization algorithms in editability and flexibility.

The main contributions of the paper are:

1. A novel automatic colorization framework that uses a pre-trained diffusion model and introduces an imagination module to generate color reference images that are semantically similar, structurally aligned, and instance-aware, imitating human experts.
2. A demonstration that the framework offers substantial controllability and user interaction, and can produce diverse colorization results.
3. State-of-the-art performance and generalization compared with previous automatic colorization methods.

The paper also compares existing automatic colorization methods and identifies the challenges they face: handling color ambiguity, a tendency to produce gray or undersaturated colors, and the difficulty of modifying results after the fact. The imagination module and the reference refinement module address these problems, improving the naturalness, realism, and diversity of the colorization.
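The idea of selecting a reference composition and then editing it locally can be illustrated with a small sketch. This is not the paper's Reference Refinement Module: the function names `segment_error` and `select_reference_composition`, the list-of-lists image format, and the selection rule (lowest mean absolute luminance difference per instance segment) are all assumptions made for illustration only.

```python
from typing import Dict, List

Gray = List[List[float]]  # H x W luminance values in [0, 1]
Mask = List[List[int]]    # H x W instance ids (hypothetical segmentation)


def segment_error(gray: Gray, ref_luma: Gray, mask: Mask, seg_id: int) -> float:
    """Mean absolute luminance difference inside one segment (assumed criterion)."""
    total, count = 0.0, 0
    for y, row in enumerate(mask):
        for x, sid in enumerate(row):
            if sid == seg_id:
                total += abs(gray[y][x] - ref_luma[y][x])
                count += 1
    return total / count if count else float("inf")


def select_reference_composition(
    gray: Gray, ref_lumas: List[Gray], mask: Mask
) -> Dict[int, int]:
    """For each segment, pick the index of the reference that matches it best."""
    seg_ids = sorted({sid for row in mask for sid in row})
    choice: Dict[int, int] = {}
    for seg_id in seg_ids:
        errors = [segment_error(gray, ref, mask, seg_id) for ref in ref_lumas]
        choice[seg_id] = min(range(len(errors)), key=errors.__getitem__)
    return choice


# Toy input: two segments, two candidate references (luminance only).
gray = [[0.2, 0.2, 0.8], [0.2, 0.2, 0.8]]
mask = [[0, 0, 1], [0, 0, 1]]
ref_a = [[0.2, 0.2, 0.1], [0.2, 0.2, 0.1]]  # matches segment 0
ref_b = [[0.9, 0.9, 0.8], [0.9, 0.9, 0.8]]  # matches segment 1

composition = select_reference_composition(gray, [ref_a, ref_b], mask)

# A localized edit: the user overrides the reference for segment 1 only.
edited = dict(composition)
edited[1] = 0
```

Because the composition is an explicit per-segment mapping, changing the reference for one segment leaves every other segment untouched, which is the kind of localized, iterative modification the summary describes.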

Automatic Controllable Colorization via Imagination

Towards Photorealistic Colorization by Imagination

Language-based colorization of scene sketches

Control Color: Multimodal Diffusion-based Interactive Image Colorization

Exemplar-Based Image Colorization with A Learning Framework

Colorful Image Colorization

Robust And Automatic Video Colorization Via Multiframe Reordering Refinement

L-C4: Language-Based Video Colorization for Creative and Consistent Color

A learning-based approach for automatic image and video colorization

Active Colorization for Cartoon Line Drawings

Learning Representations for Automatic Colorization

Real-Time User-Guided Image Colorization with Learned Deep Priors

Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior

Automatic Advertising Image Color Design Incorporating a Visual Color Analyzer

Automatic Image Colourizer

Self-driven Dual-path Learning for Reference-based Line Art Colorization under Limited Data

SCSNet: an Efficient Paradigm for Learning Simultaneously Image Colorization and Super-resolution

Affective Image Colorization

Deep Exemplar-based Colorization

DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models

Improving reference-based image colorization for line arts via feature aggregation and contrastive learning