Abstract: This paper challenges the common assumption that the weight β in β-VAE should be larger than 1 in order to effectively disentangle latent factors. We demonstrate that β-VAE with β<1 can not only attain good disentanglement but also significantly improve reconstruction accuracy via dynamic control. The paper \textit{removes the inherent trade-off} between reconstruction accuracy and disentanglement for β-VAE. Existing methods, such as β-VAE and FactorVAE, assign a large weight to the KL-divergence term in the objective function, leading to high reconstruction errors for the sake of better disentanglement. To mitigate this problem, ControlVAE was recently developed to dynamically tune the KL-divergence weight in an attempt to \textit{control the trade-off} and move it to a more favorable point. However, ControlVAE fails to eliminate the conflict between the need for a large β (for disentanglement) and the need for a small β (for smaller reconstruction error). Instead, we propose DynamicVAE, which maintains a different β at different stages of training, thereby \textit{decoupling disentanglement and reconstruction accuracy}. In order to evolve the weight β along a trajectory that enables such decoupling, DynamicVAE leverages a modified incremental PI (proportional-integral) controller, a variant of the proportional-integral-derivative (PID) control algorithm, and employs a moving average as well as a hybrid annealing method to evolve the value of the KL-divergence smoothly in a tightly controlled fashion. We theoretically prove the stability of the proposed approach. Evaluation results on three benchmark datasets demonstrate that DynamicVAE significantly improves reconstruction accuracy while achieving disentanglement comparable to the best of existing methods. The results verify that our method can separate disentangled representation learning from reconstruction, removing the inherent tension between the two.
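The control idea in the abstract can be illustrated with a minimal sketch of an incremental PI update for β driven by a moving average of the observed KL-divergence. This is not the paper's exact formulation: the class name `IncrementalPIBeta`, the gains `kp` and `ki`, the squashing function, the window size, and the clamping bounds are all assumptions chosen for illustration. A hybrid annealing schedule would presumably be realized by changing the KL set point over training, which the hypothetical `set_target` method stands in for.

```python
import math
from collections import deque


def _squash(err):
    """Bounded nonlinearity 1 / (1 + exp(err)); input clipped to avoid overflow."""
    err = max(-50.0, min(50.0, err))
    return 1.0 / (1.0 + math.exp(err))


class IncrementalPIBeta:
    """Illustrative incremental PI controller for the KL weight (all defaults assumed)."""

    def __init__(self, target_kl, kp=0.01, ki=0.005, beta_init=1.0,
                 beta_min=0.0, beta_max=1.0, window=5):
        self.target_kl = target_kl
        self.kp = kp
        self.ki = ki
        self.beta = beta_init
        self.beta_min = beta_min
        self.beta_max = beta_max
        self.prev_err = 0.0
        # Moving average window over recent KL observations.
        self.history = deque(maxlen=window)

    def set_target(self, target_kl):
        """Change the KL set point, e.g. from an annealing schedule."""
        self.target_kl = target_kl

    def step(self, observed_kl):
        """Update beta from one KL observation and return the new value."""
        self.history.append(observed_kl)
        smoothed_kl = sum(self.history) / len(self.history)
        err = self.target_kl - smoothed_kl
        # Incremental form: only the change in beta is computed each step.
        delta = self.kp * (_squash(err) - _squash(self.prev_err)) - self.ki * err
        self.beta = min(self.beta_max, max(self.beta_min, self.beta + delta))
        self.prev_err = err
        return self.beta
```

In a training loop, one would call `step` once per iteration with the current KL value and use the returned β as the weight on the KL term. When the smoothed KL exceeds the set point, the error is negative and β rises; when it falls below, β shrinks toward `beta_min`.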

mcVAE: disentangling by mean constraint

DynamicVAE: Decoupling Reconstruction Error and Disentangled Representation Learning

Challenging $\beta$-VAE with $\beta < 1$ for Disentanglement Via Dynamic Learning

Rethinking Controllable Variational Autoencoders

Improving disentanglement in variational auto-encoders via feature imbalance-informed dimension weighting

$α$-TCVAE: On the relationship between Disentanglement and Diversity

Overlooked Implications of the Reconstruction Loss for VAE Disentanglement

Facial Landmark Disentangled Network with Variational Autoencoder

Disentanglement with Factor Quantized Variational Autoencoders

Variantional autoencoder with decremental information bottleneck for disentanglement

Guided Variational Autoencoder for Disentanglement Learning

Closed-Loop Unsupervised Representation Disentanglement with $β$-VAE Distillation and Diffusion Probabilistic Feedback

ControlVAE: Controllable Variational Autoencoder

DAVA: Disentangling Adversarial Variational Autoencoder

Bridging Disentanglement with Independence and Conditional Independence via Mutual Information for Representation Learning

Hidden Talents of the Variational Autoencoder

PRI-VAE: Principle-of-Relevant-Information Variational Autoencoders

How to train your VAE

eVAE: Evolutionary Variational Autoencoder

Disentangling shared and private latent factors in multimodal Variational Autoencoders