Abstract: Multi-task learning (MTL) has succeeded in various industrial applications by exploiting knowledge shared among jointly trained tasks to improve the generalization of MTL models, which can improve performance across all training tasks simultaneously. Unfortunately, training all tasks simultaneously often degrades performance relative to single-task models, since different tasks may conflict with each other. Although existing MTL methods aim to mitigate task conflicts by manipulating task gradients at each iteration, they ignore the potential influence of noisy data from different batches on task gradients. Consequently, the task gradient at the current iteration may not accurately reflect the task itself, so task conflicts are inadequately alleviated. Moreover, existing works seldom explore the potential sources of task conflicts and merely assume them. In this paper, we conduct an in-depth empirical investigation into the potential sources of performance degradation in MTL and find that task gradient conflict is one of the primary causes. Then, to address task conflicts, we propose a novel gradient manipulation approach, MoCoGrad, which leverages each task's momentum information to calibrate the gradients of conflicting tasks. In addition, we derive theoretical convergence guarantees for MoCoGrad and analyze its convergence rate. Finally, to evaluate the effectiveness of MoCoGrad, we conduct extensive experiments on six real-world datasets from different domains. Our approach yields the best performance across all tasks on all six MTL benchmarks, demonstrating the effectiveness and superiority of our method.
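
The idea of calibrating conflicting task gradients with momentum can be illustrated with a small sketch. The abstract does not state MoCoGrad's update rule, so everything below is an assumption made for illustration: the conflict test (negative inner product, as is common in the gradient-manipulation literature), the exponential-moving-average momentum, the function names `conflicts` and `calibrate`, and the parameters `beta` and `lam` are hypothetical and should not be read as the authors' method.

```python
import numpy as np


def conflicts(g_a, g_b):
    """Treat two task gradients as conflicting when their inner product is negative."""
    return float(np.dot(g_a, g_b)) < 0.0


def calibrate(grads, momenta, beta=0.9, lam=0.1):
    """Hypothetical momentum-based calibration (illustrative, not the paper's rule).

    grads:   list of 1-D arrays, one flattened gradient per task
    momenta: list of 1-D arrays, running per-task gradient averages
    beta:    assumed decay rate of the moving average
    lam:     assumed strength of the calibration step
    """
    # Smooth each task's gradient over iterations to damp batch noise.
    new_momenta = [beta * m + (1.0 - beta) * g for m, g in zip(momenta, grads)]
    calibrated = [g.copy() for g in grads]
    for i, g_i in enumerate(grads):
        for j, g_j in enumerate(grads):
            if i != j and conflicts(g_i, g_j):
                # Assumption: nudge task i toward task j's smoothed direction.
                calibrated[i] += lam * new_momenta[j]
    return calibrated, new_momenta


# Two tasks whose gradients point in partly opposite directions.
grads = [np.array([1.0, 0.0]), np.array([-1.0, 1.0])]
momenta = [np.zeros(2), np.zeros(2)]
calibrated, momenta = calibrate(grads, momenta)
```

Because the correction uses the smoothed momentum rather than the raw gradient of the other task, a single noisy batch has limited influence on the calibration, which is the motivation the abstract gives for using momentum.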

Towards Task-Conflicts Momentum-Calibrated Approach for Multi-task Learning

A Model-Agnostic Approach to Mitigate Gradient Interference for Multi-Task Learning

Conflict-Averse Gradient Descent for Multi-task Learning

Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective

GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning

Dual-Balancing for Multi-Task Learning

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

Towards Impartial Multi-task Learning

Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning

Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement

Independent Component Alignment for Multi-Task Learning

Fair Resource Allocation in Multi-Task Learning

Equitable Multi-task Learning

Examining Common Paradigms in Multi-Task Learning

Improving Multi-task Learning via Seeking Task-based Flat Regions

Improvable Gap Balancing for Multi-Task Learning

Multiple Task Learning Using Iteratively Reweighted Least Square

Multi-Task Learning with Multi-Task Optimization

Task Grouping for Automated Multi-Task Machine Learning via Task Affinity Prediction

AdaMerging: Adaptive Model Merging for Multi-Task Learning

Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism