Abstract: This paper introduces UnZipLoRA, a method for decomposing an image into its constituent subject and style, represented as two distinct LoRAs (Low-Rank Adaptations). Unlike existing personalization techniques that focus on either subject or style in isolation, or require separate training sets for each, UnZipLoRA disentangles these elements from a single image by training both LoRAs simultaneously. UnZipLoRA ensures that the resulting LoRAs are compatible, i.e., they can be seamlessly combined using direct addition. UnZipLoRA enables independent manipulation and recontextualization of subject and style, including generating variations of each, applying the extracted style to new subjects, and recombining them to reconstruct the original image or create novel variations. To address the challenge of subject and style entanglement, UnZipLoRA employs a novel prompt separation technique, as well as column and block separation strategies, to accurately preserve the characteristics of subject and style and to ensure compatibility between the learned LoRAs. Evaluation with human studies and quantitative metrics demonstrates UnZipLoRA's effectiveness compared to other state-of-the-art methods, including DreamBooth-LoRA, Inspiration Tree, and B-LoRA.


### What problem does this paper attempt to solve?

This paper addresses the problem of separating content (or subject) and style from a single image. Specifically, the authors propose **UnZipLoRA**, a method that decomposes an image into two independent low-rank adaptation (LoRA) models: one representing the content (or subject) of the image and the other representing its style. The two LoRAs are learned simultaneously during training, and each can generate new images on its own or be recombined with the other to create new variants of the original image.

#### Main challenges:

1. **Multi-task learning under single-image supervision**: Traditional personalization techniques usually focus on either content or style, or require separate training sets to learn each. UnZipLoRA instead aims to learn content and style simultaneously from a single image.
2. **Disentanglement problem**: The two LoRAs must not interfere with each other during training, so that each accurately captures its own concept (content or style).
3. **Compatibility**: The learned content LoRA and style LoRA must combine seamlessly, so that high-quality images can be generated by direct addition during inference.

#### Solutions:

To address these challenges, UnZipLoRA introduces the following key techniques:

1. **Prompt Separation**: Train the content LoRA and the style LoRA with different prompts, avoiding cross-contamination.
2. **Column Separation**: Dynamically allocate columns of the weight matrix so that the content and style LoRAs remain orthogonal, reducing interference.
3. **Block Separation**: Adjust the training strategy of the content LoRA and the style LoRA according to how sensitive different U-Net blocks are to content and style, further improving the accuracy of details.

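The column separation and direct-addition ideas can be illustrated with a small sketch. This is not the authors' implementation: the summary does not say how columns are allocated dynamically, so the fixed split in `subject_cols` and `style_cols`, the toy matrix sizes, and all helper names are assumptions made for illustration only. The sketch shows that when two low-rank updates occupy disjoint columns of a weight matrix, they do not overlap and can be merged by plain addition.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of uniform random values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def mask_columns(matrix, keep):
    """Zero every column whose index is not in `keep`."""
    return [[v if j in keep else 0.0 for j, v in enumerate(row)] for row in matrix]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


rng = random.Random(0)
d_out, d_in, rank = 4, 6, 2  # toy sizes, chosen only for illustration

# Each LoRA contributes a low-rank update delta = B @ A,
# with B of shape (d_out x rank) and A of shape (rank x d_in).
delta_subject = matmul(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))
delta_style = matmul(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))

# Assumption: a fixed, disjoint split of column indices. The method described
# above allocates columns dynamically; that allocation rule is not modeled here.
subject_cols = {0, 1, 2}
style_cols = {3, 4, 5}

masked_subject = mask_columns(delta_subject, subject_cols)
masked_style = mask_columns(delta_style, style_cols)

# Direct addition: with disjoint columns, neither update overwrites the other.
merged = add(masked_subject, masked_style)

# Sum of absolute element-wise products; zero means the two updates
# never touch the same entry of the weight matrix.
overlap = sum(
    abs(s * t)
    for row_s, row_t in zip(masked_subject, masked_style)
    for s, t in zip(row_s, row_t)
)
print("overlap between masked updates:", overlap)
```

Because each column of `merged` comes from exactly one of the two updates, the subject and style contributions can still be recovered separately after merging, which is the property that disjoint column allocation is meant to provide in this toy setting.
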
Through these methods, UnZipLoRA disentangles content and style from a single image and generates high-quality image variants. Experimental results show that UnZipLoRA outperforms other state-of-the-art methods, such as DreamBooth-LoRA, Inspiration Tree, and B-LoRA, in both human studies and quantitative evaluations.

### Summary

The core contribution of UnZipLoRA is a novel and effective method that separates content and style from a single image and allows these elements to be flexibly manipulated and recombined, opening new possibilities for artistic creation and personalized image generation.

UnZipLoRA: Separating Content and Style from a Single Image
