Abstract: While fine-tuning pretrained models has become common practice, these models often underperform outside their specific domains. Recently developed model merging techniques enable the direct integration of multiple models, each fine-tuned for a distinct task, into a single model. This strategy provides multitasking capabilities without requiring retraining on the original datasets. However, existing methods fall short in addressing potential conflicts and complex correlations between tasks, especially in parameter-level adjustments, which makes it difficult to balance parameter competition across tasks. This paper introduces PCB-Merging (Parameter Competition Balancing), a lightweight, training-free technique that adjusts the coefficients of each parameter for effective model merging. PCB-Merging employs intra-balancing to gauge parameter significance within individual tasks and inter-balancing to assess parameter similarities across different tasks. Parameters with low importance scores are dropped, and the remaining ones are rescaled to form the final merged model. We assess our approach in diverse merging scenarios, including cross-task, cross-domain, and cross-training configurations, as well as out-of-domain generalization. The experimental results show that our approach achieves substantial performance improvements across multiple modalities, domains, model sizes, numbers of tasks, fine-tuning forms, and large language models, outperforming existing model merging methods. The code is publicly available at: https://github.com/duguodong7/pcb-merging
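The pipeline described in the abstract (score parameters within each task, score them across tasks, drop the low-scoring ones, rescale the rest) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the scoring functions below are assumptions chosen only to make the four steps concrete. In particular, the function name `pcb_style_merge`, the use of squared magnitude for intra-balancing, the averaged elementwise product for inter-balancing, and the `keep_ratio` and `scale` parameters are all hypothetical. Models are assumed to be flattened into 1-D float arrays.

```python
import numpy as np


def _softmax(x):
    """Numerically stable softmax over a flat vector."""
    e = np.exp(x - x.max())
    return e / e.sum()


def _minmax(x):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def pcb_style_merge(pretrained, finetuned, keep_ratio=0.2, scale=1.0):
    """Merge fine-tuned models into one (illustrative sketch only).

    pretrained: 1-D float array of base weights.
    finetuned:  list of 1-D float arrays with the same shape, one per task.
    """
    # Task vectors: what each fine-tuning run changed relative to the base.
    task_vectors = np.stack([ft - pretrained for ft in finetuned])  # (T, D)
    n_tasks, n_params = task_vectors.shape

    scores = np.zeros_like(task_vectors)
    for i in range(n_tasks):
        tv = task_vectors[i]
        # Intra-balancing (assumed form): importance within a single task,
        # taken from the normalized squared magnitude of its task vector.
        intra = _softmax(n_tasks * _minmax(tv * tv))
        # Inter-balancing (assumed form): agreement with every task,
        # taken from the normalized elementwise product of task vectors.
        inter = np.mean([_minmax(tv * other) for other in task_vectors], axis=0)
        scores[i] = intra * inter

    # Drop: keep only the top `keep_ratio` fraction of scores in each task.
    k = max(1, int(round(keep_ratio * n_params)))
    masked = np.zeros_like(scores)
    for i in range(n_tasks):
        threshold = np.partition(scores[i], -k)[-k]
        masked[i] = np.where(scores[i] >= threshold, scores[i], 0.0)

    # Rescale: score-weighted average of the surviving task-vector entries.
    numerator = (masked * task_vectors).sum(axis=0)
    denominator = masked.sum(axis=0)
    merged_tv = np.divide(numerator, denominator,
                          out=np.zeros_like(numerator), where=denominator > 0)
    return pretrained + scale * merged_tv


# Usage on synthetic weights (random data, for illustration only).
rng = np.random.default_rng(0)
base = rng.normal(size=1000)
experts = [base + 0.01 * rng.normal(size=1000) for _ in range(3)]
merged = pcb_style_merge(base, experts, keep_ratio=0.2)
print(merged.shape)  # (1000,)
```

Because the final step is a non-negative weighted average, each merged parameter update stays between the smallest and largest task-specific update for that parameter (or zero where every task dropped it). No training data or gradient step is involved, which matches the training-free setting the abstract describes.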

Parameter-efficient Weight Ensembling Facilitates Task-level Knowledge Transfer

Towards a Unified View of Parameter-Efficient Transfer Learning

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

Parameter-Efficient Transfer Learning for NLP

Parameter-efficient Tuning of Large-scale Multimodal Foundation Model

Parameter-Efficient Fine-Tuning With Adapters

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm

PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning

ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

Efficient Multi-Task and Transfer Reinforcement Learning with Parameter-Compositional Framework

Making Parameter-efficient Tuning More Efficient: A Unified Framework for Classification Tasks

Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging

Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models

$π$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation

PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph Completion

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

Parameter-efficient Tuning for Large Language Model Without Calculating Its Gradients

Parameter Competition Balancing for Model Merging

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective