Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis

Yankun Wu,Yuta Nakashima,Noa Garcia

DOI: https://doi.org/10.1145/3591106.3592262

2023-04-20

Abstract: The duality of content and style is inherent to the nature of art. For humans, these two elements are clearly different: content refers to the objects and concepts in the piece of art, and style to the way it is expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by the style, which may reflect the author's emotions, social trends, artistic movement, etc., and a deep comprehension of them undoubtedly requires handling both. A promising step towards a general paradigm for art analysis is to disentangle content and style, whereas relying on human annotations to cull a single aspect of artworks has limitations in learning semantic concepts and the visual appearance of paintings. We thus present GOYA, a method that distills the artistic knowledge captured in a recent generative model to disentangle content and style. Experiments show that synthetically generated images sufficiently serve as a proxy of the real distribution of artworks, allowing GOYA to separately represent the two elements of art while keeping more information than existing methods.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the disentanglement of content and style in art analysis. The content of an artwork (the depicted objects, figures, or scenes) and its style (color, composition, shape, and other visual forms of expression) are two basic elements of art analysis. From the perspective of computer vision, however, the boundary between the two is not clear, which makes in-depth understanding of artworks difficult. Traditional methods usually rely on manual annotations that capture a single aspect of an artwork (either content or style), which limits what they can learn about semantic concepts and the visual appearance of paintings. The authors therefore propose GOYA, a method that uses the artistic knowledge captured by a recent generative model (Stable Diffusion) to disentangle the content and style of artworks. Experiments show that synthetically generated images can serve as an effective proxy for the distribution of real artworks, enabling GOYA to represent the two elements separately while retaining more information than existing methods. Beyond improving the understanding of content and style in artworks, the method also offers new tools and perspectives for digital humanities research.
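To make the idea of representing content and style separately more concrete, here is a minimal sketch of one common way to train such representations on synthetic data. It is not taken from the paper: the sample prompts, the toy 4-dimensional embeddings, the slicing "heads", and the margin-based contrastive loss are all assumptions chosen for illustration. The only idea borrowed from the summary above is that images generated from known prompts tell us which pairs share content and which share style, without human annotation.

```python
import math
from itertools import combinations

# Hypothetical synthetic samples: (content prompt, style prompt, embedding).
# In a real pipeline the embedding would come from an image encoder applied
# to a generated image; these hand-made 4-d vectors are stand-ins.
SAMPLES = [
    ("a dog", "impressionism", [1.0, 0.0, 1.0, 0.0]),
    ("a dog", "cubism", [1.0, 0.0, 0.0, 1.0]),
    ("a boat", "impressionism", [0.0, 1.0, 1.0, 0.0]),
    ("a boat", "cubism", [0.0, 1.0, 0.0, 1.0]),
]


def content_head(x):
    # Assumption: a fixed slice stands in for a learned content projection.
    return x[:2]


def style_head(x):
    # Assumption: a fixed slice stands in for a learned style projection.
    return x[2:]


def distance(a, b):
    # Euclidean distance between two vectors of equal length.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def contrastive_loss(za, zb, same, margin=1.0):
    # Pull matching pairs together; push non-matching pairs at least
    # `margin` apart (margin-based contrastive loss).
    d = distance(za, zb)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def total_loss(samples, margin=1.0):
    # Sum the loss over all pairs, separately in each space. The prompts
    # act as free labels: same content prompt means a content positive,
    # same style prompt means a style positive.
    content_loss = 0.0
    style_loss = 0.0
    for (c1, s1, x1), (c2, s2, x2) in combinations(samples, 2):
        content_loss += contrastive_loss(
            content_head(x1), content_head(x2), c1 == c2, margin
        )
        style_loss += contrastive_loss(
            style_head(x1), style_head(x2), s1 == s2, margin
        )
    return content_loss, style_loss
```

Calling `total_loss(SAMPLES)` on these toy vectors gives zero loss in both spaces, because the embeddings were constructed to be already disentangled. With learned heads, the same two losses would be minimized by gradient descent instead.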
