Abstract: Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. This approach introduces a trade-off between disentangled representation learning and reconstruction quality, since the model does not have enough capacity to learn the correlated latent variables that capture the detail information present in most image data. To overcome this trade-off, we present a novel multi-stage modeling approach in which the disentangled factors are first learned using a penalty-based disentangled representation learning method; the low-quality reconstruction is then improved with another deep generative model that is trained to model the missing correlated latent variables, adding detail information while maintaining conditioning on the previously learned disentangled factors. Taken together, our multi-stage modeling approach results in a single, coherent probabilistic model that is theoretically justified by the principle of d-separation and can be realized with a variety of model classes, including likelihood-based models such as variational autoencoders, implicit models such as generative adversarial networks, and tractable models such as normalizing flows or mixtures of Gaussians. We demonstrate that our multi-stage model has higher reconstruction quality than current state-of-the-art methods with equivalent disentanglement performance across multiple standard benchmarks. In addition, we apply the multi-stage model to generate synthetic tabular datasets and show improved performance over benchmark models across a variety of metrics. The interpretability analysis further indicates that the multi-stage model can effectively uncover distinct and meaningful factors of variation from which the original distribution can be recovered.
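The two-stage idea described in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: linear PCA stands in for both the penalty-based disentangled learner and the second generative model, and all names (`fit_pca`, `z`, `u`, `y`, `x_hat`) and the toy data are assumptions made for illustration only. It shows only the structure: a low-capacity first stage yields factors `z` and a coarse reconstruction `y`, and a second stage adds extra latents `u` to model the detail that the first stage missed while `z` is left unchanged.

```python
import numpy as np

def fit_pca(data, k):
    """Return (mean, components) of a k-component PCA fitted via SVD."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:k]

def encode(data, mean, comps):
    """Project data onto the PCA components."""
    return (data - mean) @ comps.T

def decode(codes, mean, comps):
    """Map codes back to data space."""
    return codes @ comps + mean

rng = np.random.default_rng(0)
# Toy data (hypothetical): 2 dominant factors plus 3 lower-variance detail factors.
factors = rng.normal(size=(500, 2))
detail = rng.normal(size=(500, 3)) * 0.3
x = factors @ rng.normal(size=(2, 10)) + detail @ rng.normal(size=(3, 10))

# Stage 1: a low-capacity model (PCA stand-in) gives factors z
# and a coarse reconstruction y.
mean1, comps1 = fit_pca(x, k=2)
z = encode(x, mean1, comps1)
y = decode(z, mean1, comps1)

# Stage 2: a second model (again a PCA stand-in) with extra latents u
# models what stage 1 missed; z and y are kept fixed.
residual = x - y
mean2, comps2 = fit_pca(residual, k=3)
u = encode(residual, mean2, comps2)
x_hat = y + decode(u, mean2, comps2)

mse_stage1 = float(np.mean((x - y) ** 2))
mse_stage2 = float(np.mean((x - x_hat) ** 2))
print(f"stage 1 MSE: {mse_stage1:.4f}, stage 2 MSE: {mse_stage2:.4f}")
```

In this toy setting the second stage can only lower the reconstruction error, because it fits the residual left by the first stage without altering `z`. The model classes named in the abstract (variational autoencoders, generative adversarial networks, normalizing flows) would replace the PCA stand-ins in a realistic setup.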

Forced to Learn: Discovering Disentangled Representations Without Exhaustive Labels

Understanding Neural Networks through Representation Erasure

Consistent Representation Learning for High Dimensional Data Analysis

Disentangled Representations in Neural Models

Learning Disentangled Representation with Pairwise Independence

Learning Disentangled Discrete Representations

Disentangling Factors of Variation in Deep Representations Using Adversarial Training

Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling

Independence Constrained Disentangled Representation Learning from Epistemological Perspective

Learning Robust Representation for Clustering Through Locality Preserving Variational Discriminative Network

An Approach Towards Learning K-means-friendly Deep Latent Representation

Improving Deep Embedded Clustering via Learning Cluster-level Representations

Cluster Specific Representation Learning

Self-labelling via simultaneous clustering and representation learning

Learning Disentangled Representations via Mutual Information Estimation

Neural Clustering based Visual Representation Learning

Towards a Unified Framework of Contrastive Learning for Disentangled Representations

ACE: A Coarse-to-Fine Learning Framework for Reliable Representation Learning Against Label Noise

Improving Robustness and Generality of NLP Models Using Disentangled Representations

Disentangling and Learning Robust Representations with Natural Clustering

Bridging Disentanglement with Independence and Conditional Independence Via Mutual Information for Representation Learning