Deep multi-task learning with flexible and compact architecture search
Jiejie Zhao, Weifeng Lv, Bowen Du, Junchen Ye, Leilei Sun, Guixi Xiong
DOI: https://doi.org/10.1007/s41060-021-00274-0
2021-07-24
International Journal of Data Science and Analytics
Abstract: Multi-task learning has been applied successfully in various applications. Recent research shows that the performance of multi-task learning methods can be improved by appropriately sharing model architectures. However, existing work either designs the multi-task architecture manually based on prior knowledge, or simply uses an identical model structure for all tasks with a parameter sharing mechanism. In this paper, we propose a novel architecture search method that automatically discovers flexible and compact architectures for deep multi-task learning. The method not only extends the expressiveness of existing reinforcement learning-based neural architecture search methods, but also enhances the flexibility of existing hand-crafted multi-task learning methods. The discovered architecture shares structure and parameters adaptively to handle different levels of task relatedness, which improves effectiveness. First, we propose an architecture search space for deep multi-task learning that includes a combination of partially shared modules at the low-level layer and a set of task-specific modules of various depths at the high-level layers. Second, we propose a parameter generation mechanism that both explores all possible cross-layer connections and reduces the search cost. Third, we propose a task-specific shadow batch normalization mechanism to stabilize the training process and improve search effectiveness. Finally, we design an auxiliary module to guide the model training process. Experimental results demonstrate that the learned architectures outperform state-of-the-art methods with fewer learnable parameters.
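
The abstract names a task-specific shadow batch normalization mechanism but does not describe how it works. As a rough illustration of the general idea of keeping separate normalization statistics per task inside an otherwise shared layer, the following minimal sketch may help. It is not the authors' mechanism: the class name `TaskSpecificBatchNorm`, the `stats` layout, the scalar features, and the `momentum` and `eps` defaults are all assumptions made for illustration.

```python
import math


class TaskSpecificBatchNorm:
    """Illustrative only: one set of running statistics per task.

    This is a generic per-task batch normalization over scalar features,
    not the shadow batch normalization described in the paper.
    """

    def __init__(self, task_names, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # Hypothetical layout: a running mean and variance for each task.
        self.stats = {name: {"mean": 0.0, "var": 1.0} for name in task_names}

    def forward(self, batch, task, training=True):
        s = self.stats[task]
        if training:
            # Normalize with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update the running statistics of this task only.
            m = self.momentum
            s["mean"] = (1 - m) * s["mean"] + m * mean
            s["var"] = (1 - m) * s["var"] + m * var
        else:
            # At evaluation time, use the stored statistics of this task.
            mean, var = s["mean"], s["var"]
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


bn = TaskSpecificBatchNorm(["task_a", "task_b"])
normalized = bn.forward([1.0, 3.0], task="task_a")
```

In this sketch, a training batch from `task_a` updates only the running statistics stored for `task_a`, so tasks with different activation distributions do not overwrite each other's normalization state.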