Abstract:Different conflicting optimization criteria arise naturally in various Deep Learning scenarios. These can address different main tasks (i.e., in the setting of Multi-Task Learning), but also main and secondary tasks such as loss minimization versus sparsity. The usual approach is a simple weighting of the criteria, which formally only works in the convex setting. In this paper, we present a Multi-Objective Optimization algorithm using a modified Weighted Chebyshev scalarization for training Deep Neural Networks (DNNs) with respect to several tasks. By employing this scalarization technique, the algorithm can identify all optimal solutions of the original problem while reducing its complexity to a sequence of single-objective problems. The simplified problems are then solved using an Augmented Lagrangian method, enabling the use of popular optimization techniques such as Adam and Stochastic Gradient Descent, while efficaciously handling constraints. Our work aims to address the (economical and also ecological) sustainability issue of DNN models, with a particular focus on Deep Multi-Task models, which are typically designed with a very large number of weights to perform equally well on multiple tasks. Through experiments conducted on two Machine Learning datasets, we demonstrate the possibility of adaptively sparsifying the model during training without significantly impacting its performance, if we are willing to apply task-specific adaptations to the network weights. Code is available at <a class="link-external link-https" href="https://github.com/salomonhotegni/MDMTN" rel="external noopener nofollow">this https URL</a>

What problem does this paper attempt to address?

### Problems the paper attempts to solve The paper aims to solve the problems of multi - objective optimization (MOO) in deep learning, especially in the scenario of multi - task learning (MTL). Specifically, the paper focuses on the following aspects: 1. **Multi - objective optimization in multi - task learning**: - In multi - task learning, different tasks may have different optimization objectives, and these objectives are often in conflict with each other. For example, one task may need to minimize the loss function, while another task may need to increase the sparsity of the model to reduce the computational complexity. - The traditional approach is to combine multiple objectives into a single objective function by weighted summation, but this often fails to find the optimal solution in non - convex cases. 2. **Model sparsification**: - Deep neural networks (DNN) are usually designed to be very large in order to perform well on multiple tasks. However, such large models are not only computationally expensive, but also economically and ecologically unsustainable. - The paper proposes a method to adaptively sparsify the model during the training process, thereby reducing the complexity and computational cost of the model without significantly affecting the model's performance. 3. **Multi - objective optimization algorithms**: - The paper proposes a multi - objective optimization algorithm based on the modified weighted Chebyshev scalarization. Through this method, all the optimal solutions of the original problem can be effectively found and simplified into a series of single - objective optimization problems. - The augmented Lagrangian method is used to solve these simplified single - objective optimization problems, so that popular optimization techniques (such as Adam and stochastic gradient descent) can be utilized. 4. **Experimental verification**: - The paper conducts experiments on two datasets, namely MultiMNIST and Cifar10Mnist. Through these experiments, the effectiveness and feasibility of the proposed multi - objective optimization method in multi - task learning are verified. ### Main contributions 1. **Proposing a new multi - objective optimization algorithm**: - Based on the modified weighted Chebyshev scalarization and the augmented Lagrangian method, it can effectively handle the multi - objective optimization problems in multi - task learning. 2. **Model sparsification**: - By introducing the Group Ordered Weighted l1 (GrOWL) regularization term, the adaptive sparsification of the model is achieved, reducing the complexity and computational cost of the model. 3. **Experimental verification**: - Experiments are carried out on two datasets to verify the effectiveness of the proposed method. The experimental results show that the number of model parameters and computational complexity can be significantly reduced while maintaining the performance of the model. ### Conclusion By proposing a new multi - objective optimization algorithm, the paper successfully solves the optimization problems of multiple conflicting objectives in multi - task learning, and improves the economic and ecological sustainability of the model through the adaptive sparsification technique. The experimental results further verify the effectiveness and feasibility of this method.

Multi-Objective Optimization for Sparse Deep Multi-Task Learning

Multi-Task Learning as Multi-Objective Optimization

Multi-Adaptive Optimization for multi-task learning with deep neural networks

Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning

The Multi-Task Learning with an Application of Pareto Improvement.

Multi-Task Learning with Multi-Task Optimization

Do Current Multi-Task Optimization Methods in Deep Learning Even Help?

Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

No More Tuning: Prioritized Multi-Task Learning with Lagrangian Differential Multiplier Methods

Efficient and sparse neural networks by pruning weights in a multiobjective learning approach

A New Multitask Joint Learning Framework for Expensive Multi-Objective Optimization Problems

Pareto Multi-Task Learning

Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance

Multi-task Sparse Structure Learning

Multi-objective Deep Learning: Taxonomy and Survey of the State of the Art

Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement

A Multi-task Learning Approach by Combining Derivative-Free and Gradient Methods.

Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective

Dual-Balancing for Multi-Task Learning

Proximal Multitask Learning Over Distributed Networks With Jointly Sparse Structure

Towards Impartial Multi-task Learning.