A Model-Agnostic Approach to Mitigate Gradient Interference for Multi-Task Learning
Heyan Chai, Zhe Yin, Ye Ding, Li Liu, Binxing Fang, Qing Liao
DOI: https://doi.org/10.1109/tcyb.2022.3223377
IF: 11.8
2022-01-01
IEEE Transactions on Cybernetics
Abstract: Multitask learning (MTL) is a powerful technique for jointly learning multiple tasks. However, it is difficult to achieve a tradeoff between tasks during iterative training, as some tasks may compete with each other. Existing methods manually design specific network models to mitigate task conflicts, but they require considerable manual effort and prior knowledge about task relationships to tune the model so as to obtain the best performance for each task. Moreover, few works have offered formal descriptions of task conflicts and theoretical explanations for the cause of task conflict problems. In this article, we provide a formal description of task conflicts that are caused by the gradient interference problem of tasks. To alleviate this issue, we propose a novel model-agnostic approach to mitigate gradient interference (MAMG) by designing a gradient clipping rule that directly modifies the interfering components on the gradient interfering direction. Specifically, MAMG is model-agnostic and thus it can be applied to a large number of multitask models. We also theoretically prove the convergence of MAMG and its superiority to existing MTL methods. We evaluate our method on a variety of real-world large datasets, and extensive experimental results confirm that MAMG can outperform some state-of-the-art algorithms on different types of tasks and can be easily applied to various methods.
automation & control systems, computer science, cybernetics, artificial intelligence
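
The abstract attributes task conflicts to gradient interference but does not spell out the MAMG clipping rule itself. As a rough illustration only, the sketch below assumes a common reading of "interference" (two task gradients with a negative inner product) and uses a generic projection-style adjustment that removes the component of one gradient pointing against another. This is not the authors' rule; the names `interferes`, `remove_interfering_component`, and `combine` are hypothetical and chosen for this example.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def interferes(g_i: Sequence[float], g_j: Sequence[float]) -> bool:
    """Assumed definition: two task gradients interfere when their
    inner product is negative (they point in opposing directions)."""
    return dot(g_i, g_j) < 0.0


def remove_interfering_component(g_i: Sequence[float],
                                 g_j: Sequence[float]) -> Vector:
    """Illustrative stand-in for a mitigation rule (NOT the MAMG rule):
    if g_i opposes g_j, subtract the projection of g_i onto g_j so the
    result is orthogonal to g_j; otherwise return g_i unchanged."""
    d = dot(g_i, g_j)
    if d >= 0.0:
        return list(g_i)
    scale = d / dot(g_j, g_j)  # g_j is non-zero here because d < 0
    return [x - scale * y for x, y in zip(g_i, g_j)]


def combine(grads: Sequence[Sequence[float]]) -> Vector:
    """Adjust each task gradient against every other task's original
    gradient, then sum the adjusted gradients into one update direction."""
    adjusted = []
    for i, g_i in enumerate(grads):
        g = list(g_i)
        for j, g_j in enumerate(grads):
            if i != j:
                g = remove_interfering_component(g, g_j)
        adjusted.append(g)
    return [sum(column) for column in zip(*adjusted)]


if __name__ == "__main__":
    task_a = [1.0, 0.0]
    task_b = [-1.0, 1.0]
    print(interferes(task_a, task_b))
    print(combine([task_a, task_b]))
```

Because the adjustment acts only on gradient vectors, it makes no assumption about the network that produced them, which is the sense in which such an approach can be model-agnostic.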