Abstract: Federated learning is a machine learning paradigm that enables decentralized clients to collaboratively learn a shared model while keeping all the training data local. While considerable research has focused on federated image generation, particularly Generative Adversarial Networks, Variational Autoencoders have received less attention. In this paper, we address the challenges of non-IID (non-independent and identically distributed) data environments featuring multiple groups of images of different types. Specifically, heterogeneous data distributions can make it difficult to maintain a consistent latent space and can also cause local generators with disparate texture features to be blended during aggregation. We introduce a novel approach, FissionVAE, which decomposes the latent space and constructs decoder branches tailored to individual client groups. This method allows for customized learning that aligns with the unique data distributions of each group. Additionally, we investigate the incorporation of hierarchical VAE architectures and demonstrate the use of heterogeneous decoder architectures within our model. We also explore strategies for setting the latent prior distributions to enhance the decomposition process. To evaluate our approach, we assemble two composite datasets: the first combines MNIST and FashionMNIST; the second comprises RGB datasets of cartoon and human faces, wild animals, marine vessels, and remote sensing images of Earth. Our experiments demonstrate that FissionVAE greatly improves generation quality on these datasets compared to baseline federated VAE models.

What problem does this paper attempt to address?

### The Problem the Paper Attempts to Solve

This paper aims to address the challenge of generating high-quality images under non-independent and identically distributed (non-IID) data conditions in a Federated Learning (FL) environment. Specifically, the paper focuses on the issues encountered by Variational Autoencoders (VAEs) when dealing with multiple different types of image datasets.

### Background and Challenges

1. **Inconsistent Data Distribution**: In federated learning, the data distribution of each client may differ, leading to inconsistencies in the models of each client during training. This inconsistency, particularly for generative models like VAEs, can cause the shared latent space to be difficult to maintain consistently, thereby affecting the quality of the generated images.
2. **Generator Fusion Problem**: When aggregating the generators of each client, the significant differences in data characteristics across clients can result in generated images with mixed features, meaning the generated images contain features from different datasets. This severely impacts the authenticity and quality of the generated images.
3. **Limitations of Existing Methods**: Existing research mainly focuses on Generative Adversarial Networks (GANs), with less attention given to VAEs. Although some methods attempt to mitigate these issues by exchanging local discriminators or grouping and aggregating generators, these methods pose risks to client privacy and still fall short in generating high-quality images.

### Solution

To address the above issues, the paper proposes a new model called FissionVAE. The main innovations of FissionVAE include:

1. **Latent Space Decomposition**: FissionVAE decomposes the latent space according to different data groups, with each data group corresponding to a unique prior distribution. This ensures that each client's data is mapped to its corresponding latent distribution, avoiding the mixing of latent spaces between different data groups.
2. **Customized Decoder Branches**: FissionVAE designs specialized decoder branches for each client group, allowing these branches to learn in a customized manner based on the characteristics of their respective data groups. This better preserves the unique visual features of different image types.
3. **Hierarchical Inference Architecture**: FissionVAE introduces a hierarchical inference architecture, allowing the use of deeper network structures to capture more complex data distributions. This architecture not only improves the quality of the generated images but also increases the model's flexibility, enabling clients with different computational resources to use different decoder architectures.

### Experimental Validation

To validate the effectiveness of FissionVAE, the paper constructs two composite datasets:

1. **Mixed MNIST**: Combines the MNIST and FashionMNIST datasets, containing handwritten digits and clothing images, respectively.
2. **CHARM**: Includes five different datasets, namely anime faces, real human faces, animals, remote sensing images, and marine vessel images.

Experimental results show that FissionVAE significantly outperforms baseline federated VAE models in terms of generation quality on these datasets, particularly in reducing the mixed features of generated images.

### Conclusion

By proposing the FissionVAE model, the paper effectively addresses the challenge of generating high-quality images under non-IID data conditions in federated learning. The model significantly improves the quality and diversity of generated images through latent space decomposition, customized decoder branches, and a hierarchical inference architecture.
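The latent space decomposition and group-specific decoder branches described under Solution can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each group's prior is a unit-variance Gaussian with its own mean, that the encoder is shared and averaged over all clients, and that each decoder branch is averaged only over clients in its group. The names `kl_to_group_prior`, `aggregate`, and `GROUP_PRIOR_MEANS`, the prior means, and the flat parameter lists are hypothetical, and the unweighted mean is a simplification of sample-size-weighted federated averaging.

```python
import math
from collections import defaultdict

# Hypothetical prior means, one per client group (assumed, not from the paper).
GROUP_PRIOR_MEANS = {"digits": [-1.0, -1.0], "clothing": [1.0, 1.0]}


def kl_to_group_prior(mu, log_var, prior_mu):
    """KL( N(mu, diag(exp(log_var))) || N(prior_mu, I) ) for one sample."""
    return 0.5 * sum(
        math.exp(lv) + (m - p) ** 2 - 1.0 - lv
        for m, lv, p in zip(mu, log_var, prior_mu)
    )


def average(vectors):
    """Element-wise unweighted mean of equally sized parameter vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def aggregate(client_updates):
    """Average the encoder over all clients, each decoder branch per group.

    Each update is a dict with keys 'group', 'encoder', and 'decoder', where
    the parameters are flat lists of floats (a stand-in for real weights).
    """
    encoder = average([u["encoder"] for u in client_updates])
    by_group = defaultdict(list)
    for u in client_updates:
        by_group[u["group"]].append(u["decoder"])
    decoders = {g: average(params) for g, params in by_group.items()}
    return encoder, decoders


# Toy round: two 'digits' clients and one 'clothing' client.
updates = [
    {"group": "digits", "encoder": [0.0, 2.0], "decoder": [1.0, 1.0]},
    {"group": "digits", "encoder": [2.0, 2.0], "decoder": [3.0, 1.0]},
    {"group": "clothing", "encoder": [4.0, 2.0], "decoder": [10.0, 10.0]},
]
encoder, decoders = aggregate(updates)

# A posterior centred on the 'digits' prior is penalised only by the other prior.
kl_own = kl_to_group_prior([-1.0, -1.0], [0.0, 0.0], GROUP_PRIOR_MEANS["digits"])
kl_other = kl_to_group_prior([-1.0, -1.0], [0.0, 0.0], GROUP_PRIOR_MEANS["clothing"])
```

In this toy round the `clothing` decoder parameters pass through aggregation untouched by the `digits` clients, which is the property that keeps one group's visual features from being blended into another's. The KL term plays the complementary role on the encoder side by pulling each group's posteriors toward its own prior.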

FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition
