Abstract: The latent space of pre-trained generative adversarial networks (GANs) is rich in semantic information, but that information is often highly entangled. Identifying semantic directions in this latent space is crucial, because these directions correlate with image attributes and underpin image editing tasks. Existing methods for semantic discovery usually rely on labor-intensive procedures such as manual labeling and training attribute classifiers, which limits their practicality. To address this, the paper proposes the Optimal Transport-based Unsupervised Semantic Disentanglement (OTUSD) algorithm, which uncovers semantic directions in the latent space of GANs by drawing on manifold learning and optimal transport (OT) theory. OTUSD applies singular value decomposition (SVD) to the OT matrix that links latent codes to generated images; the resulting singular vectors correspond to semantically meaningful directions. Unlike traditional methods, OTUSD requires no time-consuming labeling or training, which improves efficiency and reveals a wider array of semantically meaningful directions. Experiments show that OTUSD discovers semantic directions in several state-of-the-art GAN models, including StyleGAN, StyleGAN2, and BigGAN. These results point to the applicability of OTUSD to image editing and related tasks, and to the value of exploiting the manifold learning and OT mapping capabilities inherent in GANs for semantic disentanglement. The implementation code is available at https://github.com/LuckAlex/OTUSD .
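
The SVD step described in the abstract can be illustrated with a toy sketch. The abstract does not say how the OT matrix is constructed, so this example does not implement OTUSD: as a clearly labeled stand-in, it fits a least-squares linear map from latent codes to outputs and decomposes that instead. The linear "generator", the helper `top_directions`, and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def top_directions(link_matrix, k):
    """Return the k leading right singular vectors and their singular values.

    link_matrix has shape (image_dim, latent_dim), so each right singular
    vector lives in latent space and can serve as a candidate edit direction.
    """
    _, singular_values, vt = np.linalg.svd(link_matrix, full_matrices=False)
    return vt[:k], singular_values[:k]

rng = np.random.default_rng(0)
latent_dim, image_dim, n_samples = 8, 32, 500

# Hypothetical toy generator: a linear map with known singular structure.
u, _ = np.linalg.qr(rng.normal(size=(image_dim, latent_dim)))
v, _ = np.linalg.qr(rng.normal(size=(latent_dim, latent_dim)))
strengths = np.array([5.0, 3.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.1])
true_map = u @ np.diag(strengths) @ v.T

# Paired samples: latent codes and the flattened "images" they generate.
latents = rng.normal(size=(n_samples, latent_dim))
images = latents @ true_map.T

# Assumption: a least-squares fit stands in for the OT matrix of the paper.
coeffs, *_ = np.linalg.lstsq(latents, images, rcond=None)
link_matrix = coeffs.T  # shape (image_dim, latent_dim)

directions, values = top_directions(link_matrix, k=2)

# Edit a latent code by moving along the leading direction.
alpha = 2.0
z = latents[0]
z_edited = z + alpha * directions[0]
image_shift = true_map @ z_edited - true_map @ z
```

In this toy setting the leading direction matches the first column of `v` up to sign, and moving a latent code by `alpha` along it shifts the output by `alpha` times the largest singular value. Whether directions found this way are semantically meaningful for a real GAN depends on how the linking matrix is built, which is the part the paper contributes.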

ComGAN: Unsupervised Disentanglement and Segmentation via Image Composition

Compositional GAN: Learning Image-Conditional Binary Composition

Compositional GAN: Learning Conditional Image Composition

UGC: Unified GAN Compression for Efficient Image-to-Image Translation

DepGAN: Leveraging Depth Maps for Handling Occlusions and Transparency in Image Composition

Self-Ensembling GAN for Cross-Domain Semantic Segmentation

MT-GAN: toward realistic image composition based on spatial features

Semantic Segmentation by Improved Generative Adversarial Networks

SS-CPGAN: Self-Supervised Cut-and-Pasting Generative Adversarial Network for Object Segmentation

3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis

Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

S Seg: A Three-Stage Unsupervised Foreground and Background Segmentation Network

Generating Images Part by Part with Composite Generative Adversarial Networks

SAC-GAN: Structure-Aware Image Composition

InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs

CoDeGAN: Contrastive Disentanglement for Generative Adversarial Network

Learning Segmentation Masks with the Independence Prior

Representation Decomposition for Image Manipulation and Beyond

Comp-GAN

Optimal transport-based unsupervised semantic disentanglement: A novel approach for efficient image editing in GANs

Extracting Semantic Knowledge From GANs With Unsupervised Learning
