Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning

Lukas Kirchdorfer, Cathrin Elich, Simon Kutsche, Heiner Stuckenschmidt, Lukas Schott, Jan M. Köhler
2024-08-15
Abstract: With the rise of neural networks in various domains, multi-task learning (MTL) gained significant relevance. A key challenge in MTL is balancing individual task losses during neural network training to improve performance and efficiency through knowledge sharing across tasks. To address these challenges, we propose a novel task-weighting method by building on the most prevalent approach of Uncertainty Weighting and computing analytically optimal uncertainty-based weights, normalized by a softmax function with tunable temperature. Our approach yields comparable results to the combinatorially prohibitive, brute-force approach of Scalarization while offering a more cost-effective yet high-performing alternative. We conduct an extensive benchmark on various datasets and architectures. Our method consistently outperforms six other common weighting methods. Furthermore, we report noteworthy experimental findings for the practical application of MTL. For example, larger networks diminish the influence of weighting methods, and tuning the weight decay has a low impact compared to the learning rate.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
### What problem does this paper attempt to address?

This paper addresses a key challenge in multi-task learning (MTL): how to balance the individual task losses during neural network training to improve performance and efficiency. Specifically, the authors propose a new method based on Uncertainty Weighting, called Soft Optimal Uncertainty Weighting (UW-SO), to address the following problems:

1. **Imbalanced task losses**:
   - Different tasks may use different loss functions (such as L1 loss and cross-entropy loss), resulting in different loss scales.
   - Some tasks are more difficult than others and require more resources.
   - Even with the same loss function, data noise and prediction uncertainty lead to different loss magnitudes across tasks.
2. **Limitations of existing methods**:
   - Equal Weighting (EW) may let some tasks dominate the training process.
   - Dynamic weighting methods (such as Uncertainty Weighting, UW) are vulnerable to poor initialization and overfitting.
   - Brute-force search (such as Scalarization) performs well, but its computational cost is too high for most practical scenarios.

### Solutions

To solve the above problems, the authors propose the following:

1. **Soft Optimal Uncertainty Weighting (UW-SO)**:
   - Building on Uncertainty Weighting (UW), the optimal uncertainty-based weights are derived analytically and then normalized by a softmax function with a temperature parameter.
   - The method has a single hyperparameter to tune (the temperature \( T \)) and achieves results comparable to Scalarization at a much lower computational cost.
2. **Extensive experimental verification**:
   - Experiments were carried out on multiple datasets (such as NYUv2, Cityscapes, and CelebA) and different architectures to verify the effectiveness of UW-SO.
   - The results show that UW-SO performs well across tasks and network architectures and consistently outperforms six other common weighting methods.

### Main contributions

- Proposed UW-SO, a new uncertainty-based weighting method that addresses the initialization and overfitting problems of existing methods.
- Verified the effectiveness of UW-SO through extensive experiments.
- Reported practical observations for MTL: larger networks reduce the performance differences between weighting methods, and tuning the learning rate matters more than tuning the weight decay.

Through these contributions, the authors aim to provide a more efficient and easy-to-apply task-weighting method for multi-task learning.
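
The weighting described under Solutions can be sketched in a few lines. This is a minimal illustration, not the authors' implementation, and the exact form of the weights is an assumption that this summary does not spell out. It assumes the common Uncertainty Weighting objective \( \sum_k \frac{1}{2\sigma_k^2} L_k + \log \sigma_k \). Minimizing that objective over \( \sigma_k \) gives \( \sigma_k^2 = L_k \), so the analytically optimal weight is proportional to \( 1 / L_k \). The sketch then passes the inverse losses through a softmax with temperature \( T \). The function names `uw_so_weights` and `weighted_total_loss` are hypothetical.

```python
import math


def uw_so_weights(losses, temperature=1.0):
    """Softmax over inverse task losses, scaled by a temperature.

    Assumed form (not stated in the summary):
        w_k = exp((1 / L_k) / T) / sum_j exp((1 / L_j) / T)
    In an autodiff framework, the losses inside the weights would be
    treated as constants (stop-gradient), so only the weighted sum is
    differentiated.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if any(loss <= 0 for loss in losses):
        raise ValueError("all task losses must be positive")

    # Inverse losses act as logits: a smaller loss yields a larger weight.
    logits = [(1.0 / loss) / temperature for loss in losses]

    # Subtract the maximum logit for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_total_loss(losses, temperature=1.0):
    """Combine per-task losses using the softmax-normalized weights."""
    weights = uw_so_weights(losses, temperature)
    return sum(w * loss for w, loss in zip(weights, losses))


if __name__ == "__main__":
    task_losses = [2.0, 0.5, 1.0]  # hypothetical per-task loss values
    for t in (0.5, 1.0, 10.0):
        print(t, uw_so_weights(task_losses, temperature=t))
```

Under these assumptions, a large temperature pushes the weights toward Equal Weighting, while a small temperature concentrates weight on the task with the smallest loss. The temperature is therefore the one value to tune.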