Language-based Photo Color Adjustment for Graphic Designs

Zhenwei Wang, Nanxuan Zhao, Gerhard Hancke, Rynson W.H. Lau
DOI: https://doi.org/10.1145/3592111
2023-08-06
Abstract: Adjusting the photo color to associate with some design elements is an essential way for a graphic design to effectively deliver its message and make it aesthetically pleasing. However, existing tools and previous works face a dilemma between the ease of use and level of expressiveness. To this end, we introduce an interactive language-based approach for photo recoloring, which provides an intuitive system that can assist both experts and novices on graphic design. Given a graphic design containing a photo that needs to be recolored, our model can predict the source colors and the target regions, and then recolor the target regions with the source colors based on the given language-based instruction. The multi-granularity of the instruction allows diverse user intentions. The proposed novel task faces several unique challenges, including: 1) color accuracy for recoloring with exactly the same color from the target design element as specified by the user; 2) multi-granularity instructions for parsing instructions correctly to generate a specific result or multiple plausible ones; and 3) locality for recoloring in semantically meaningful local regions to preserve original image semantics. To address these challenges, we propose a model called LangRecol with two main components: the language-based source color prediction module and the semantic-palette-based photo recoloring module. We also introduce an approach for generating a synthetic graphic design dataset with instructions to enable model training. We evaluate our model via extensive experiments and user studies. We also discuss several practical applications, showing the effectiveness and practicality of our approach. Code and data for this paper are at: https://zhenwwang.github.io/langrecol.
Computer Vision and Pattern Recognition, Artificial Intelligence, Graphics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper addresses photo color adjustment in graphic design. It proposes a language-based photo recoloring method that lets users, both professionals and novices, adjust a photo's colors so that they harmonize with other elements in the design. Existing tools and methods trade ease of use against expressive power; the paper aims for an interactive system that offers both.

### Main Challenges

1. **Color Accuracy**: The recolored region should use exactly the color of the design element the user specifies, not an approximation of it.
2. **Multi-Granularity Instructions**: The system must parse instructions of different granularities, from vague (e.g., "background") to specific (e.g., "yellow shape"), and generate either one specific result or several plausible ones.
3. **Locality**: Color edits should be restricted to semantically meaningful local regions so that the semantics of the original image are preserved.

### Solution

To address these challenges, the paper proposes a language-based photo recoloring framework called **LangRecol**. The framework consists of two main modules:

1. **Language-Based Source Color Prediction Module**: This module predicts the source color by parsing the design elements and the instruction. It uses a multi-task approach that combines granularity recognition with source color prediction, supporting both accurate and diverse results.
2. **Semantic Palette-Based Photo Recoloring Module**: This module restricts color edits to local regions by predicting an initial region mask and refining it into semantically meaningful soft color layers, which are then used to generate the recolored result.

### Main Contributions

1. **Designed a tool called LangRecol** for adjusting photo colors in graphic design based on multi-granularity language instructions.
2. **Proposed a multi-task model** for parsing graphic design elements and understanding multi-granularity instructions while predicting accurate source colors.
3. **Introduced a semantic palette-based method** that predicts target regions in a coarse-to-fine manner and generates high-fidelity recoloring results.
4. **Developed a method** for synthesizing a plausible graphic design dataset, based on real-world design knowledge and principles, to support model training.

### Related Work

The paper reviews the most relevant prior work: scribble-based, palette-based, example-based, and language-based image recoloring. Each has strengths and weaknesses, but none fully suits this task. In particular, existing language-based recoloring methods often rely on vague color instructions and therefore cannot guarantee that the exact colors of the design elements are used.

### Summary

The paper addresses language-driven photo color adjustment in graphic design by proposing the **LangRecol** framework. The framework targets ease of use while maintaining color accuracy and result diversity. The authors evaluate it through experiments and user studies and discuss several practical applications.
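
To make the idea of recoloring through soft color layers more concrete, here is a minimal sketch of a generic offset-based palette edit. It is not the LangRecol implementation: the function name `recolor_layer`, the flat pixel list, and the update rule (shift each pixel by `alpha * (source_color - palette_color)`) are illustrative assumptions, chosen only to show how a soft mask can keep an edit local while letting fully covered pixels reach the source color exactly.

```python
from typing import List, Tuple

Color = Tuple[float, float, float]  # RGB, each channel in [0, 1]


def recolor_layer(
    pixels: List[Color],
    alpha: List[float],
    palette_color: Color,
    source_color: Color,
) -> List[Color]:
    """Shift one soft color layer from palette_color toward source_color.

    Hypothetical helper for illustration only. Each pixel moves by
    alpha * (source_color - palette_color), so:
      - pixels outside the layer (alpha = 0) are left untouched (locality);
      - a pixel equal to palette_color with alpha = 1 lands on source_color
        (color accuracy);
      - deviations from palette_color, such as shading, are carried over.
    """
    if len(pixels) != len(alpha):
        raise ValueError("pixels and alpha must have the same length")

    # Per-channel offset between the layer's current color and the new one.
    delta = tuple(s - p for s, p in zip(source_color, palette_color))

    recolored: List[Color] = []
    for pixel, a in zip(pixels, alpha):
        # Apply the offset scaled by the soft mask, then clamp to [0, 1].
        shifted = tuple(
            min(1.0, max(0.0, channel + a * d))
            for channel, d in zip(pixel, delta)
        )
        recolored.append(shifted)
    return recolored


if __name__ == "__main__":
    photo = [(0.2, 0.4, 0.6), (0.9, 0.9, 0.9), (0.2, 0.4, 0.6)]
    mask = [1.0, 0.0, 0.5]  # soft membership of each pixel in the layer
    print(recolor_layer(photo, mask, (0.2, 0.4, 0.6), (1.0, 0.5, 0.0)))
```

In this sketch the source color would come from a design element selected by the instruction, and the soft mask from a region prediction step; both are simply passed in as inputs here.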