HGMVAE: hierarchical disentanglement in Gaussian mixture variational autoencoder

Jiashuang Zhou, Yongqi Liu, Xiaoqin Du
DOI: https://doi.org/10.1007/s00371-024-03338-x
IF: 2.835
2024-04-02
The Visual Computer
Abstract: Recent advances in deep neural networks have shown great potential for generating realistic data and performing clustering tasks, owing to their ability to capture intricate patterns. However, current generative models suffer from poor performance and high computational complexity caused by the curse of dimensionality. The variational autoencoder (VAE), a commonly used method, is also prone to posterior collapse, and its latent variables perform poorly in multiclass classification. Our goal in this study is to achieve effective disentanglement in image generation, classification and clustering tasks. We develop a VAE-based generative network that uses a Gaussian mixture distribution as the prior. This enhancement improves the representation of the latent variables and helps to overcome the difficulty of matching the ground-truth posterior. To further improve clustering performance, we introduce the total correlation as a kernel for computing latent features between embedding points and cluster centers. This technique is particularly useful when the latent variables are complex, and it can also be applied to hierarchical disentanglement. Moreover, we employ the Fisher discriminant as a regularization term that minimizes the within-class distance and maximizes the between-class distance of samples; our experimental results indicate that this term has an important effect on model performance. We evaluate the proposed network on four datasets, and the experimental results demonstrate its effectiveness across multiple metrics.
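The abstract does not give the exact form of the Fisher discriminant term, so the following is only a minimal sketch of one common way such a regularizer can be written: the ratio of within-class scatter to between-class scatter of the latent codes. The function name `fisher_regularizer`, the use of scatter traces, and the `eps` stabilizer are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fisher_regularizer(z, labels, eps=1e-8):
    """Hypothetical Fisher-style penalty on latent codes.

    Returns (within-class scatter) / (between-class scatter), using the
    traces of the two scatter matrices. Smaller values mean tighter,
    better separated classes. This is an illustrative assumption, not
    the loss defined in the paper.
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    global_mean = z.mean(axis=0)

    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        zc = z[labels == c]                 # latent codes of class c
        mu_c = zc.mean(axis=0)              # class centroid
        within += np.sum((zc - mu_c) ** 2)  # spread around the centroid
        between += len(zc) * np.sum((mu_c - global_mean) ** 2)

    return within / (between + eps)

# Toy latent codes: two tight groups far apart.
z = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
aligned = fisher_regularizer(z, [0, 0, 1, 1])  # labels match the groups
mixed = fisher_regularizer(z, [0, 1, 0, 1])    # labels cut across the groups
```

In this toy case the penalty is far smaller when the labels agree with the geometric groups than when they cut across them, which is the behavior a term that "minimizes the within-class distance and maximizes the between-class distance" would reward. In a training loop, a differentiable version of this quantity would presumably be added to the VAE objective with a weighting coefficient.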
computer science, software engineering