Abstract: We study multi-task dense prediction, covering tasks such as semantic segmentation, depth estimation, and surface normal estimation, in the multi-task partially supervised learning (MTPSL) setting, where the training data is only partially annotated. The difficulty is that no training image carries labels for every task. Because these pixel-wise dense tasks are inter-related, we focus on mining and capturing cross-task relationships. Existing solutions typically learn global image representations and match them across tasks, a constraint that sacrifices the finer structures within images. Local matching is a natural remedy, but without precise region supervision, local alignment is hard to achieve. The Segment Anything Model (SAM) addresses this gap by providing free, high-quality region detection. Given SAM-detected regions, the remaining challenge is to align the representations within them. Rather than directly learning a monolithic image representation as conventional methods do, we model region-wise representations as Gaussian distributions. Aligning these distributions between corresponding regions from different tasks offers greater flexibility and capacity to capture intra-region structure, and accommodates a broader range of tasks. This improves our ability to capture cross-task relationships and, in turn, overall performance in partially supervised multi-task dense prediction. Extensive experiments on two widely used benchmarks demonstrate the effectiveness of our method, which achieves state-of-the-art performance even when compared with fully supervised methods.
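The region-wise alignment idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the covariance structure or the alignment objective, so the sketch assumes diagonal Gaussians and uses the squared 2-Wasserstein distance as a stand-in. The function names (`region_gaussian`, `wasserstein2_sq`, `region_alignment_loss`) and the flat per-pixel data layout are hypothetical, and the region ids stand in for SAM-detected regions.

```python
import math

def region_gaussian(features, mask, region_id, eps=1e-6):
    """Fit a diagonal Gaussian to the feature vectors of one region.

    features: list of per-pixel feature vectors (lists of floats)
    mask:     list of region ids, one per pixel (assumed to come from SAM)
    """
    pixels = [f for f, m in zip(features, mask) if m == region_id]
    if not pixels:
        raise ValueError(f"region {region_id} has no pixels")
    n, dim = len(pixels), len(pixels[0])
    mean = [sum(p[d] for p in pixels) / n for d in range(dim)]
    # eps keeps the variance strictly positive for single-pixel regions
    var = [sum((p[d] - mean[d]) ** 2 for p in pixels) / n + eps
           for d in range(dim)]
    return mean, var

def wasserstein2_sq(g1, g2):
    """Squared 2-Wasserstein distance between two diagonal Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    mean_term = sum((a - b) ** 2 for a, b in zip(m1, m2))
    std_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(v1, v2))
    return mean_term + std_term

def region_alignment_loss(feats_a, feats_b, mask):
    """Average distance between the two tasks' Gaussians over shared regions."""
    region_ids = sorted(set(mask))
    total = sum(
        wasserstein2_sq(region_gaussian(feats_a, mask, r),
                        region_gaussian(feats_b, mask, r))
        for r in region_ids
    )
    return total / len(region_ids)

# Toy example: four pixels, two regions, 2-D features for two tasks.
mask = [0, 0, 1, 1]
feats_task_a = [[0.0, 1.0], [2.0, 1.0], [5.0, 5.0], [7.0, 5.0]]
feats_task_b = [[x + 3.0, y] for x, y in feats_task_a]  # shifted copy
loss = region_alignment_loss(feats_task_a, feats_task_b, mask)
```

Because each region is summarized by a mean and a variance rather than a single vector, the loss is sensitive to the spread of features inside a region as well as to their average, which is the flexibility the abstract attributes to distribution-level alignment.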

Cross-Supervised Learning for Instance Level Multi-Task Training

In Defense Of Multi-Source Omni-Supervised Efficient Convnet For Robust Semantic Segmentation In Heterogeneous Unseen Domains

Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning

Learning Multiple Dense Prediction Tasks from Partially Annotated Data

A Teacher-Student Approach to Cross-Domain Transfer Learning with Multi-level Attention

When Self-Supervised Learning Meets Scene Classification: Remote Sensing Scene Classification Based on a Multitask Learning Framework

CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation

Partly Supervised Multitask Learning

Cross-dataset Training for Class Increasing Object Detection

Efficiently Identifying Task Groupings for Multi-Task Learning

Adaptive and Robust Multi-Task Learning

Multi-Task Label Discovery via Hierarchical Task Tokens for Partially Annotated Dense Predictions

Learning the Shared Subspace for Multi-task Clustering and Transductive Transfer Classification

Multi-task Semi-supervised Learning for Pulmonary Lobe Segmentation

Multi-Task Consistency for Active Learning

Multi-Task Learning Via SA-FPN and EJ-Head

Semi-supervised Multi-task Learning for Semantics and Depth

DenseMTL: Cross-task Attention Mechanism for Dense Multi-task Learning

Task Understanding from Confusing Multi-task Data

Cross-Domain Complementary Learning with Synthetic Data for Multi-Person Part Segmentation

Dual Supervised Learning