Abstract: Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model. This interpretation is obtained by decomposing the concept into a small set of human-interpretable textual elements. Applied over the state-of-the-art Stable Diffusion model, Conceptor reveals non-trivial structures in the representations of concepts. For example, we find surprising visual connections between concepts, that transcend their textual semantics. We additionally discover concepts that rely on mixtures of exemplars, biases, renowned artistic styles, or a simultaneous fusion of multiple meanings of the concept. Through a large battery of experiments, we demonstrate Conceptor's ability to provide meaningful, robust, and faithful decompositions for a wide variety of abstract, concrete, and complex textual concepts, while allowing to naturally connect each decomposition element to its corresponding visual impact on the generated images. Our code will be available at: https://hila-chefer.github.io/Conceptor/

What problem does this paper attempt to address?

This paper addresses the difficulty of understanding the internal representations of text-to-image diffusion models. Although these models excel at generating high-quality, diverse images from text prompts, the representations they learn internally remain poorly understood. Most existing research evaluates only the output images, so insight into the internal representations is still very limited. To address this, the paper proposes CONCEPTOR, a method for interpreting how a text-to-image diffusion model internally represents a textual concept. By decomposing a concept into a small set of human-interpretable text elements, CONCEPTOR reveals non-trivial structures in concept representations, such as visual connections between different concepts, the influence of famous art styles, and the mixing of different meanings of homonyms.

### Main Contributions

1. **Proposing CONCEPTOR**: A new method for decomposing a text concept into a set of interpretable elements. It learns a mapping from the text embedding space to coefficients and represents the concept as a linear combination of vocabulary elements weighted by those coefficients.
2. **Revealing Deep Connections between Concepts**: The method uncovers visual connections between concepts that go beyond their textual associations.
3. **Discovering Complex Internal Structures**: These include concepts that rely on mixtures of exemplars, on famous art styles, or on a simultaneous fusion of multiple meanings of the concept.
4. **Detecting Hard-to-Detect Biases**: The method exposes biases that are not easily detected by visual inspection of the generated images, which supports fact-based discussion of important ethical issues.

### Method Overview

CONCEPTOR is implemented through the following steps:

- **Pseudo-Token Learning**: A neural network (an MLP) maps each word in the vocabulary to a coefficient. The pseudo-token is constructed as a weighted linear combination of the highest-weighted vocabulary elements and is trained to denoise images of the concept.
- **Optimization Objective**: The MLP is optimized by minimizing a reconstruction loss and a sparsity loss, so that the pseudo-token mimics the denoising process of the concept images and is dominated by the highest-weighted vocabulary elements.
- **Single-Image Decomposition**: For a specific generated image, the main contributing elements are extracted to reveal visual connections.

### Experimental Results

Through extensive experiments, the authors demonstrate the effectiveness of CONCEPTOR in interpreting a variety of abstract, concrete, and complex text concepts. The results show that CONCEPTOR provides meaningful, robust, and faithful decompositions and outperforms the baseline methods on multiple metrics.

In conclusion, the paper addresses a gap in interpreting the internal representations of text-to-image diffusion models by proposing CONCEPTOR, a tool for further understanding and improving these models.
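The pseudo-token construction from the Method Overview can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the five-word vocabulary, the 3-dimensional embeddings, and the fixed coefficients are hypothetical (in the method the coefficients come from a trained MLP), and the sparsity term is written as one minus the cosine similarity between the full and top-k pseudo-tokens, which is only one plausible form of such a loss. The reconstruction loss is omitted because it requires the diffusion model itself.

```python
import math


def weighted_sum(coeffs, embeddings, indices):
    """Return sum_i coeffs[i] * embeddings[i] over the given indices."""
    dim = len(embeddings[0])
    out = [0.0] * dim
    for i in indices:
        for d in range(dim):
            out[d] += coeffs[i] * embeddings[i][d]
    return out


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pseudo_tokens(coeffs, embeddings, top_k):
    """Build the full pseudo-token and its top-k (sparse) counterpart."""
    all_idx = list(range(len(coeffs)))
    # Indices of the k largest coefficients, largest first.
    top_idx = sorted(all_idx, key=lambda i: coeffs[i], reverse=True)[:top_k]
    full = weighted_sum(coeffs, embeddings, all_idx)
    sparse = weighted_sum(coeffs, embeddings, top_idx)
    return full, sparse, top_idx


def sparsity_loss(full, sparse):
    """Assumed form: small when the top-k elements dominate the full token."""
    return 1.0 - cosine(full, sparse)


# Hypothetical toy vocabulary, embeddings, and coefficients.
vocab = ["painter", "brush", "canvas", "river", "engine"]
embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
coeffs = [0.9, 0.6, 0.05, 0.01, 0.02]

full, sparse, top_idx = pseudo_tokens(coeffs, embeddings, top_k=2)
top_words = [vocab[i] for i in top_idx]
loss = sparsity_loss(full, sparse)
print(top_words, loss)
```

In this toy setting the two highest-weighted words carry almost all of the pseudo-token, so the assumed sparsity term is close to zero. Spreading the coefficients more evenly across the vocabulary would increase it, which is the behavior a sparsity penalty is meant to discourage.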

The Hidden Language of Diffusion Models

Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion

Visual Concept-driven Image Generation with Text-to-Image Diffusion Model

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

Unveiling Concept Attribution in Diffusion Models

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines

Create Your World: Lifelong Text-to-Image Diffusion

Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance

Interactive Visual Learning for Stable Diffusion

Multi-Concept Customization of Text-to-Image Diffusion

Prompt-Free Diffusion: Taking "text" out of Text-to-Image Diffusion Models

Lost in Translation: Latent Concept Misalignment in Text-to-Image Diffusion Models

ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints

Your Diffusion Model is Secretly a Zero-Shot Classifier

Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Are Diffusion Models Vision-And-Language Reasoners?

Reverse Stable Diffusion: What prompt was used to generate this image?

Unified Concept Editing in Diffusion Models

Do text-free diffusion models learn discriminative visual representations?