Transfer Vision Patterns for Multi-Task Pixel Learning.

Xiaoya Zhang, Ling Zhou, Yong Li, Zhen Cui, Jin Xie, Jian Yang
DOI: https://doi.org/10.1145/3474085.3475501
2021-01-01
Abstract: Multi-task pixel perception is one of the most important topics in the field of machine intelligence. Inspired by the observation of cross-task interdependencies of visual patterns, we propose a multi-task vision pattern transformation (VPT) method to adaptively correlate and transfer cross-task visual patterns by leveraging the powerful transformer mechanism. Specifically, to better transfer visual patterns, we build two types of pattern transformation based on the statistical prior that affinity relations across tasks are correlated. One transfers feature patterns to integrate the features of different tasks; the other exchanges structure patterns to mine and leverage latent interaction cues. These two types of transformation are encapsulated into two VPT units, which provide universal matching interfaces for multi-task learning, complement each other in guiding the transmission of feature and structure patterns, and ultimately realize an adaptive selection of important patterns across tasks. Extensive experiments on the joint learning of semantic segmentation, depth prediction, and surface normal estimation demonstrate that the proposed method is more effective than the baselines and achieves state-of-the-art performance on three pixel-level visual tasks.
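The abstract gives no equations for the VPT units, so the following is only a minimal sketch of the general idea of transferring patterns across tasks with transformer-style attention, not the authors' implementation. It assumes that each task's feature map has been flattened to a matrix of per-pixel features, that queries come from the target task and keys/values from the source task, and that the transferred features are added back residually. The function name cross_task_attention, the feature shapes, and the absence of learned projections are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_task_attention(target, source):
    """Hypothetical cross-task attention (not the paper's VPT unit).

    target: (N, C) per-pixel features of the task being refined.
    source: (N, C) per-pixel features of another task.
    Returns the refined target features and the (N, N) attention weights.
    """
    d = target.shape[-1]
    # Pixel-to-pixel affinity between the two tasks (scaled dot product).
    scores = target @ source.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    # Each target pixel gathers source-task features weighted by affinity.
    transferred = weights @ source
    # Residual connection (an assumption) keeps the original target features.
    return target + transferred, weights


# Toy example: 16 pixels with 8 channels each, random stand-in features.
rng = np.random.default_rng(0)
seg_feat = rng.standard_normal((16, 8))    # e.g. segmentation branch
depth_feat = rng.standard_normal((16, 8))  # e.g. depth branch
fused, weights = cross_task_attention(seg_feat, depth_feat)
```

In this sketch each row of weights is a distribution over source-task pixels, which is one simple way to let a model select which cross-task patterns matter for each location.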