An Adaptive Post-Processing Network with the Global-Local Aggregation for Semantic Segmentation

Guilin Zhu, Runmin Wang, Yingying Liu, Zhenlin Zhu, Changxin Gao, Li Liu, Nong Sang
DOI: https://doi.org/10.1109/tcsvt.2023.3292156
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Current semantic segmentation methods mainly focus on modeling the context of the global image to obtain high-quality segmentation results. However, they ignore the role of local image patches, which contain complementary and effective context information. In this paper, we propose an adaptive post-processing network (APPNet) for semantic segmentation that operates on the predictions current methods produce for the global image and for local image patches. The key component of APPNet is the global-local aggregation module, which models the context between global predictions and local predictions to generate an accurate pixel-wise representation. Furthermore, we develop an adaptive points replacement module to compensate for the lack of fine detail in global predictions and the overconfidence in local predictions. Our method can be readily integrated into existing segmentation methods (e.g., ConvNeXt, HRNet, ViT-Adapter) with little additional memory and without modifying the existing models. We empirically demonstrate that our method brings performance improvements across diverse datasets (e.g., Cityscapes, ADE20K, PASCAL-Context, COCO-Stuff). The code and models will be publicly available at https://github.com/zhu-gl-ux/APPN.
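The idea of replacing selected points in a global prediction with local-patch predictions can be illustrated with a minimal sketch. This is not the APPNet module: the abstract does not specify how points are chosen or replaced, so the confidence rule below, the function name `replace_uncertain_points`, and the parameter `k` are all assumptions made for illustration. The sketch assumes the local patch predictions have already been stitched back into the same pixel order as the global prediction.

```python
def argmax(values):
    """Return the index of the largest value in a non-empty list."""
    return max(range(len(values)), key=lambda i: values[i])


def replace_uncertain_points(global_probs, local_probs, k):
    """Hypothetical confidence-based point replacement (not the APPNet module).

    global_probs, local_probs: one list of class probabilities per pixel,
    with local_probs assumed to be stitched from patches into the same
    pixel order as global_probs.
    k: number of least-confident global pixels considered for replacement.
    """
    if len(global_probs) != len(local_probs):
        raise ValueError("global and local predictions must cover the same pixels")

    # Start from the global labels and their confidences (max probability).
    labels = [argmax(p) for p in global_probs]
    global_conf = [max(p) for p in global_probs]

    # Pixels where the global prediction is least confident come first.
    candidates = sorted(range(len(global_probs)), key=lambda i: global_conf[i])[:k]

    for i in candidates:
        # Switch to the local label only when the local prediction is more confident.
        if max(local_probs[i]) > global_conf[i]:
            labels[i] = argmax(local_probs[i])
    return labels


# Toy example: three pixels, two classes.
global_probs = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]]
local_probs = [[0.2, 0.8], [0.1, 0.9], [0.7, 0.3]]
fused = replace_uncertain_points(global_probs, local_probs, k=2)
print(fused)  # [0, 1, 0]
```

In the toy example only the two least-confident global pixels are candidates, so the first pixel keeps its global label even though the local prediction disagrees. Restricting replacement to uncertain points is one simple way to limit the influence of overconfident local predictions; the paper's learned module may work quite differently.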
engineering, electrical & electronic