Abstract: Understanding neural networks is crucial to creating reliable and trustworthy deep learning models. Most contemporary research in interpretability analyzes just one model at a time via causal intervention or activation analysis. Yet despite successes, these methods leave significant gaps in our understanding of the training behaviors of neural networks, how their inner representations emerge, and how we can predictably associate model components with task-specific behaviors. Seeking new insights from work in related fields, here we survey literature in the field of model merging, a field that aims to combine the abilities of various neural networks by merging their parameters and identifying task-specific model components in the process. We analyze the model merging literature through the lens of loss landscape geometry, an approach that enables us to connect observations from empirical studies on interpretability, security, model merging, and loss landscape analysis to phenomena that govern neural network training and the emergence of their inner representations. To systematize knowledge in this area, we present a novel taxonomy of model merging techniques organized by their core algorithmic principles. Additionally, we distill repeated empirical observations from the literature in these fields into characterizations of four major aspects of loss landscape geometry: mode convexity, determinism, directedness, and connectivity. We argue that by improving our understanding of the principles underlying model merging and loss landscape geometry, this work contributes to the goal of ensuring secure and trustworthy machine learning in practice.

What problem does this paper attempt to address?

The paper addresses the challenge of understanding and interpreting the internal mechanisms of neural networks, approached through the field of model merging. Specifically, it studies loss landscape geometry to explain how models behave and perform when the parameters of different neural networks are merged, and it proposes a systematic framework for understanding these phenomena. The specific problems are:

1. **Interpretability of neural networks**: The complexity and scale of current deep-learning models make the features they learn difficult to interpret. This matters most in safety-critical applications such as medical treatment and autonomous driving, where decision-making errors may lead to catastrophic consequences. Methods are therefore needed to understand the internal representations and behaviors of neural networks.
2. **Technical challenges in model merging**: Existing model merging techniques face several challenges:
   - **Weight permutation invariance**: The neurons within a layer can be reordered, together with the matching weights of the adjacent layer, without changing the function the network computes. Two models can therefore behave identically while their parameters are not aligned index by index, which makes directly averaging the parameters of different models unreliable.
   - **Differences in training distributions**: When the source models were trained on different data distributions, the performance of the merged model tends to decline.
   - **High computational cost**: For large-scale models, the merging process can be very expensive, especially in decentralized scenarios such as federated learning.
3. **Safety and reliability**: Deep-learning models are vulnerable to adversarial attacks and may leak sensitive data or generate harmful content. Studying model merging offers a way to explore how the safety and trustworthiness of models can be improved.
4. **Understanding of loss landscape geometry**: Studying the loss landscape gives a clearer picture of the training behavior of neural networks and of how their internal representations form. This helps in developing more effective model merging methods and provides a new perspective for explaining model behavior.

To address these problems, the paper makes the following contributions:

- Proposes a taxonomy of model merging techniques based on their core algorithmic principles.
- Characterizes recurring phenomena in loss landscape geometry during training by synthesizing empirical observations from multiple model merging studies.
- Establishes the connection between model merging and research on model interpretability and safety, providing directions for future research.

Overall, through in-depth analysis of model merging techniques and loss landscape geometry, the paper aims to improve the understanding of the internal mechanisms of neural networks, thereby promoting the development of safe, reliable, and efficient deep-learning models.
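The weight permutation invariance challenge listed above can be made concrete with a small numerical sketch. The example below is illustrative only and is not taken from the paper: the two-layer ReLU network, its sizes, and every name in it (`mlp`, `perm`, `avg_gap`) are assumptions chosen for the demonstration. It builds a network, permutes its hidden units to obtain a second set of weights that computes the same function, and then checks what happens when the two sets of weights are averaged naively versus after undoing the permutation.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer ReLU network (hypothetical toy model for illustration)."""
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3  # arbitrary sizes, assumed for the sketch
W1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_hidden, d_out))
b2 = rng.normal(size=d_out)

# Reorder the hidden units: permute the columns of W1, the entries of b1,
# and the rows of W2 with the same permutation.
perm = np.roll(np.arange(d_hidden), 1)  # a fixed non-identity permutation
W1_p, b1_p, W2_p = W1[:, perm], b1[perm], W2[perm, :]

x = rng.normal(size=(16, d_in))
y = mlp(x, W1, b1, W2, b2)
y_perm = mlp(x, W1_p, b1_p, W2_p, b2)

# The permuted weights compute the same function as the originals.
same_function = np.allclose(y, y_perm)

# Naive merging: average the two weight sets index by index.
y_avg = mlp(x, 0.5 * (W1 + W1_p), 0.5 * (b1 + b1_p), 0.5 * (W2 + W2_p), b2)
avg_gap = float(np.max(np.abs(y - y_avg)))

# Aligned merging: undo the permutation first, then average.
inv = np.argsort(perm)
W1_a, b1_a, W2_a = W1_p[:, inv], b1_p[inv], W2_p[inv, :]
y_aligned = mlp(x, 0.5 * (W1 + W1_a), 0.5 * (b1 + b1_a), 0.5 * (W2 + W2_a), b2)
aligned_gap = float(np.max(np.abs(y - y_aligned)))

print("permuted model matches original:", same_function)
print("max output gap after naive averaging:", avg_gap)
print("max output gap after aligned averaging:", aligned_gap)
```

In this toy setting the permutation is known, so alignment is trivial. For independently trained models the permutation has to be estimated, which is the kind of problem that alignment-based merging methods set out to solve.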

SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques

Visualizing and Understanding Neural Models in NLP

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Git Re-Basin: Merging Models modulo Permutation Symmetries

Soft Merging: A Flexible and Robust Soft Model Merging Approach for Enhanced Neural Network Performance

Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks

FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts

Visualizing the Loss Landscape of Neural Nets

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Model Merging and Safety Alignment: One Bad Model Spoils the Bunch

Arcee's MergeKit: A Toolkit for Merging Large Language Models

How to Merge Your Multimodal Models Over Time?

ZipIt! Merging Models from Different Tasks without Training

TIES-Merging: Resolving Interference When Merging Models

Evaluating Loss Landscapes from a Topology Perspective

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Exploring the Geometry and Topology of Neural Network Loss Landscapes

Merging by Matching Models in Task Parameter Subspaces

Representation Similarity: A Better Guidance of DNN Layer Sharing for Edge Computing without Training

Exploring Model Kinship for Merging Large Language Models