Contextual Instance Decoupling for Instance-Level Human Analysis

Dongkai Wang, Shiliang Zhang
DOI: https://doi.org/10.1109/TPAMI.2023.3243223
IF: 23.6
2023-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: One fundamental challenge of instance-level human analysis is to decouple instances in crowded scenes, where multiple persons overlap with each other. This paper proposes Contextual Instance Decoupling (CID), a new pipeline for decoupling persons in multi-person instance-level analysis. Instead of relying on person bounding boxes to spatially differentiate persons, CID decouples persons in an image into multiple instance-aware feature maps. Each of those feature maps is then used to infer instance-level cues for a specific person, e.g., keypoints, instance masks, or part segmentation masks. Compared with bounding box detection, CID is differentiable and robust to detection errors. Decoupling persons into different feature maps also makes it possible to isolate distractions from other persons and to explore context cues at scales larger than the bounding box size. Extensive experiments on various tasks, including multi-person pose estimation, person foreground segmentation, and part segmentation, show that CID consistently outperforms previous methods in both accuracy and efficiency. For instance, it achieves 71.3% AP on CrowdPose in multi-person pose estimation, outperforming the recent single-stage DEKR by 5.6%, the bottom-up CenterAttention by 3.7%, and the top-down JC-SPPE by 5.3%. This advantage carries over to multi-person segmentation and part segmentation tasks.