Deep multi-task learning with flexible and compact architecture search

Jiejie Zhao, Weifeng Lv, Bowen Du, Junchen Ye, Leilei Sun, Guixi Xiong

DOI: https://doi.org/10.1007/s41060-021-00274-0

2021-07-24

International Journal of Data Science and Analytics

Abstract: Multi-task learning has been applied successfully in various applications. Recent research shows that the performance of multi-task learning methods could be improved by appropriately sharing model architectures. However, the existing work either identifies multi-task architecture manually based on prior knowledge, or simply uses an identical model structure for all tasks with a parameter sharing mechanism. In this paper, we propose a novel architecture search method to discover flexible and compact architectures for deep multi-task learning automatically, which not only extends the expressiveness of existing reinforcement learning-based neural architecture search methods, but also enhances the flexibility of existing hand-crafted multi-task learning methods. The discovered architecture shares structure and parameters adaptively to handle different levels of task relatedness, resulting in effectiveness improvement. In particular, for deep multi-task learning, we propose an architecture search space which includes a combination of partially shared modules at the low-level layer, and a set of task-specific modules with various depths at high-level layers. Secondly, a parameter generation mechanism is proposed to not only explore all possible cross-layer connections, but also reduce the search cost. Thirdly, we propose a task-specific shadow batch normalization mechanism to stabilize the training process and improve the search effectiveness. Finally, an auxiliary module is designed to guide the model training process. Experimental results demonstrate that the learned architectures outperform state-of-the-art methods with fewer learning parameters.
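The search space described in the abstract (partially shared low-level modules plus task-specific high-level modules of varying depth) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' method: the function names `sample_architecture` and `modules_shared_by`, the dictionary layout, and the use of uniform random sampling in place of a reinforcement learning controller are all hypothetical.

```python
import random


def sample_architecture(num_tasks, num_shared_modules, max_depth, rng):
    """Sample one candidate from a toy multi-task search space (hypothetical layout)."""
    arch = {}
    for task in range(num_tasks):
        # Each task uses a non-empty subset of the low-level shared modules,
        # so two tasks may overlap fully, partially, or not at all.
        k = rng.randint(1, num_shared_modules)
        shared = sorted(rng.sample(range(num_shared_modules), k))
        # Each task gets its own high-level stack with its own depth.
        depth = rng.randint(1, max_depth)
        arch[task] = {"shared_modules": shared, "task_depth": depth}
    return arch


def modules_shared_by(arch, task_a, task_b):
    """Return the low-level modules that two tasks have in common."""
    a = set(arch[task_a]["shared_modules"])
    b = set(arch[task_b]["shared_modules"])
    return sorted(a & b)


# Random sampling stands in for the search controller here (an assumption).
rng = random.Random(0)
arch = sample_architecture(num_tasks=3, num_shared_modules=4, max_depth=3, rng=rng)
for task, spec in arch.items():
    print(task, spec)
print("shared by tasks 0 and 1:", modules_shared_by(arch, 0, 1))
```

In such a space, closely related tasks can end up sharing many low-level modules while weakly related tasks share few, which is one way to read the abstract's claim that sharing adapts to task relatedness.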

What problem does this paper attempt to address?

This paper addresses how to automatically discover flexible and compact model architectures for deep multi-task learning, so that structure and parameters are shared effectively among tasks. Existing multi-task learning methods either identify the multi-task architecture manually based on prior knowledge or simply use an identical model structure with a parameter-sharing mechanism for all tasks. Such methods struggle to find the optimal architecture when there are many tasks and the networks are deep, and performance may degrade when the tasks are only weakly related. The paper therefore proposes a new architecture search method that extends the expressive power of existing reinforcement learning-based neural architecture search methods and enhances the flexibility of existing hand-crafted multi-task learning methods. Its key contributions are an architecture search space, a parameter generation mechanism, and a task-specific shadow batch normalization mechanism that stabilizes training and improves search effectiveness. In addition, an auxiliary module is designed to guide model training. Experiments show that the learned architectures outperform state-of-the-art methods with fewer parameters.
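The summary does not spell out how the task-specific shadow batch normalization works, so the following is only a minimal sketch of the general idea it builds on: shared weights combined with normalization statistics kept separately per task. The class `TaskSpecificBatchNorm`, the helper `shared_layer`, the scalar-feature simplification, and the momentum value are assumptions made for illustration.

```python
import math


class TaskSpecificBatchNorm:
    """Keeps separate running statistics for every task (hypothetical sketch)."""

    def __init__(self, num_tasks, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = [0.0] * num_tasks
        self.running_var = [1.0] * num_tasks

    def forward(self, values, task, training=True):
        if training:
            # Normalize with the statistics of the current batch.
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            # Update only the running statistics that belong to this task.
            m = self.momentum
            self.running_mean[task] = (1 - m) * self.running_mean[task] + m * mean
            self.running_var[task] = (1 - m) * self.running_var[task] + m * var
        else:
            # At inference time, use the statistics stored for this task.
            mean = self.running_mean[task]
            var = self.running_var[task]
        return [(v - mean) / math.sqrt(var + self.eps) for v in values]


def shared_layer(values, weight=2.0):
    """A stand-in for a module whose weights are shared by all tasks."""
    return [weight * v for v in values]


bn = TaskSpecificBatchNorm(num_tasks=2)
# Two tasks with very different activation scales pass through the same layer.
out_a = bn.forward(shared_layer([1.0, 2.0, 3.0]), task=0)
out_b = bn.forward(shared_layer([100.0, 110.0, 120.0]), task=1)
print(bn.running_mean)
```

Because each task updates only its own statistics, a batch from one task cannot shift the normalization used by another, which is a plausible reason such a mechanism would stabilize training when tasks share modules.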

Feature Partitioning for Efficient Multi-Task Architectures

Exploring Shared Structures and Hierarchies for Multiple NLP Tasks

Multi-task Graph Neural Architecture Search with Task-aware Collaboration and Curriculum

Fast Task-Aware Architecture Inference

Efficient Controllable Multi-Task Architectures

Automatic Multi-Task Learning Framework with Neural Architecture Search in Recommendations

Efficient Architecture Search by Network Transformation

Deep Multi-Task Learning with Shared Memory

Learning Sparse Sharing Architectures for Multiple Tasks

Continual and Multi-Task Architecture Search

Mitigating Search Interference With Task-Aware Nested Search

Multi-Task Reinforcement Learning with Soft Modularization

Multi-Task Multi-Agent Shared Layers are Universal Cognition of Multi-Agent Coordination

Deep Multimodal Neural Architecture Search

Efficient Architecture Search for Diverse Tasks

Multi-Objective Neural Architecture Search Based on Diverse Structures and Adaptive Recommendation

Deep Mutual Learning across Task Towers for Effective Multi-Task Recommender Learning

Cyclic Differentiable Architecture Search

Task-Aware Neural Architecture Search

Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal