MeGA: Merging Multiple Independently Trained Neural Networks Based on Genetic Algorithm

Daniel Yun

2024-06-28

Abstract: In this paper, we introduce MeGA, a novel method for merging the weights of multiple pre-trained neural networks using a genetic algorithm. Traditional techniques, such as weight averaging and ensemble methods, often fail to fully harness the capabilities of pre-trained networks. Our approach uses a genetic algorithm with tournament selection, crossover, and mutation to optimize weight combinations, creating a more effective fusion. This technique allows the merged model to inherit advantageous features from both parent models, resulting in enhanced accuracy and robustness. Through experiments on the CIFAR-10 dataset, we demonstrate that our genetic algorithm-based weight merging method improves test accuracy compared to individual models and conventional methods. This approach provides a scalable solution for integrating multiple pre-trained networks across various deep learning applications. Code is available on GitHub at: https://github.com/YUNBLAK/MeGA-Merging-Multiple-Independently-Trained-Neural-Networks-Based-on-Genetic-Algorithm

Neural and Evolutionary Computing, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses how to merge the weights of multiple pre-trained neural networks so that the merged model benefits from the strengths of each and performs better overall. Traditional weight averaging and ensemble methods often fail to fully exploit the capabilities of pre-trained networks. The paper proposes a genetic-algorithm-based method, MeGA, which optimizes the weight combination through tournament selection, crossover, and mutation to produce a more effective fused model. The merged model can inherit advantageous features from each parent model, which improves accuracy and robustness. On the CIFAR-10 dataset, the method improves test accuracy compared to individual models and traditional methods, and the paper presents it as a scalable solution for multi-model integration in various deep learning applications. The method can also merge the weights of networks that were initialized differently and trained independently, which further supports its flexibility and robustness.
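The selection, crossover, and mutation loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat weight vectors, the uniform crossover, the Gaussian mutation, the elitism step, the interpolated initial population, and the toy fitness function (which stands in for validation accuracy) are all assumptions made for illustration.

```python
import numpy as np

def tournament_select(population, fitnesses, rng, k=3):
    """Return the fittest of k individuals drawn at random."""
    idx = rng.choice(len(population), size=k, replace=False)
    best = idx[np.argmax(fitnesses[idx])]
    return population[best]

def crossover(parent_a, parent_b, rng):
    """Uniform crossover (assumed): each weight comes from one parent at random."""
    mask = rng.random(parent_a.shape) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(child, rng, rate=0.05, scale=0.01):
    """Add small Gaussian noise to a random subset of weights (assumed scheme)."""
    mask = rng.random(child.shape) < rate
    return child + mask * rng.normal(0.0, scale, size=child.shape)

def merge_weights(weights_a, weights_b, fitness_fn,
                  pop_size=20, generations=30, seed=0):
    """Evolve a merged weight vector from two parent weight vectors."""
    rng = np.random.default_rng(seed)
    # Initial population: both parents plus random interpolations (assumption).
    population = [weights_a.copy(), weights_b.copy()]
    for alpha in rng.random(pop_size - 2):
        population.append(alpha * weights_a + (1.0 - alpha) * weights_b)

    for _ in range(generations):
        fitnesses = np.array([fitness_fn(w) for w in population])
        # Elitism (assumption): carry the best individual over unchanged.
        children = [population[int(np.argmax(fitnesses))].copy()]
        while len(children) < pop_size:
            pa = tournament_select(population, fitnesses, rng)
            pb = tournament_select(population, fitnesses, rng)
            children.append(mutate(crossover(pa, pb, rng), rng))
        population = children

    fitnesses = np.array([fitness_fn(w) for w in population])
    return population[int(np.argmax(fitnesses))]

# Toy usage: two noisy copies of a hypothetical "ideal" weight vector.
data_rng = np.random.default_rng(42)
target = data_rng.normal(size=100)
weights_a = target + data_rng.normal(0.0, 0.1, size=100)
weights_b = target + data_rng.normal(0.0, 0.1, size=100)

def toy_fitness(w):
    """Stand-in for validation accuracy: higher means closer to the target."""
    return -float(np.mean((w - target) ** 2))

merged = merge_weights(weights_a, weights_b, toy_fitness)
```

Because both parents are in the initial population and the best individual is always carried forward, the merged vector in this sketch never scores worse than either parent on the chosen fitness function. In a real setting, `fitness_fn` would load the candidate weights into the network and evaluate it on held-out data, which is far more expensive than the toy function used here.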

Optimizing Weights by Genetic Algorithm for Neural Network Ensemble

Soft Merging: A Flexible and Robust Soft Model Merging Approach for Enhanced Neural Network Performance

Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously

Autonomously and Simultaneously Refining Deep Neural Network Parameters by a Bi-Generative Adversarial Network Aided Genetic Algorithm

A New Adaptive Merging and Growing Algorithm for Designing Artificial Neural Networks

GACNN: Training Deep Convolutional Neural Networks with Genetic Algorithm

An Auto-Parallel Method for Deep Learning Models Based on Genetic Algorithm

Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning

SUPERMERGE: An Approach For Gradient-Based Model Merging

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts

Extending MLP ANN hyper-parameters Optimization by using Genetic Algorithm

Optimizing Multi-Instance Neural Networks Based on an Improved Genetic Algorithm

AlphaNet: Improved Training of Supernets with Alpha-Divergence

GAN Cocktail: Mixing GANs Without Dataset Access

Heterogeneous Double Populations Based Hybrid Genetic Algorithm Design For Training Feedforward Neural Networks

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Safe Crossover of Neural Networks Through Neuron Alignment

Adaptively Transferring Deep Neural Networks with a Hybrid Evolution Strategy

MetaCGAN: A Novel GAN Model for Generating High Quality and Diversity Images with Few Training Data

Speeding Up EfficientNet: Selecting Update Blocks of Convolutional Neural Networks using Genetic Algorithm in Transfer Learning