PyraBiNet: A Hybrid Semantic Segmentation Network Combining PVT and BiSeNet for Deformable Objects in Indoor Environments.

Zehan Tan, Weidong Yang, Zhiwei Zhang
DOI: https://doi.org/10.1007/978-981-99-8181-6_42
2023-01-01
Abstract: In this study, we introduce PyraBiNet, an innovative hybrid model optimized for lightweight semantic segmentation tasks. This model merges the merits of Convolutional Neural Networks (CNNs) and Transformers. We propose a dual-branch structure that employs the global feature extraction capabilities of the Pyramid Vision Transformer (PVT) and the local feature extraction proficiency of BiSeNet. Specifically, the global feature branch employs a transformer from PVT to capture high-level patterns from input images, while the local feature branch uses a CNN, inspired by BiSeNet, to extract fine-grained details. Comprehensive evaluations conducted on the ADE20K and DOS datasets underscore PyraBiNet's superior performance compared with existing state-of-the-art methods. With its effective and efficient performance, PyraBiNet proves to be a valuable asset in the domain of mobile robotics, particularly for applications such as sweeping robots. The source code and dataset are available at https://github.com/zehantan6970/PyraBiNet .