Greit-HRNet: Grouped Lightweight High-Resolution Network for Human Pose Estimation

Junjia Han
2024-10-07
Abstract: As multi-scale features are necessary for human pose estimation tasks, high-resolution networks are widely applied. To improve efficiency, lightweight modules have been proposed to replace costly point-wise convolutions in high-resolution networks, including channel weighting and spatial weighting methods. However, they fail to maintain the consistency of weights and to capture global spatial information. To address these problems, we present a Grouped lightweight High-Resolution Network (Greit-HRNet), in which we propose a Greit block including a grouping method, Grouped Channel Weighting (GCW), and a spatial weighting method, Global Spatial Weighting (GSW). GCW modules group conditional channel weighting to make weights stable and maintain the high-resolution features as the network deepens, while GSW modules effectively extract global spatial information and exchange information across channels. In addition, we apply the Large Kernel Attention (LKA) method to improve the overall efficiency of our Greit-HRNet. Our experiments on both MS-COCO and MPII human pose estimation datasets demonstrate the superior performance of our Greit-HRNet, outperforming other state-of-the-art lightweight networks.
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the trade-off between efficiency and accuracy in high-resolution networks for human pose estimation. Specifically, it proposes a new lightweight high-resolution network (Greit-HRNet) to address the following challenges:

1. **Multi-scale feature extraction and efficiency**:
   - High-resolution networks are necessary for human pose estimation because they capture the details of human keypoints better. However, traditional high-resolution networks (such as HRNet) perform well but have high computational complexity, which makes efficient inference difficult.
   - Lightweight networks (such as Lite-HRNet) reduce computational complexity, but sacrifice accuracy in some cases.

2. **Weight consistency and global spatial information capture**:
   - Existing lightweight methods (such as conditional channel weighting and spatial weighting) have inconsistent weight shapes at different stages, resulting in computational redundancy and loss of high-resolution information.
   - When global average pooling (GAP) is used to compute spatial weights, it cannot fully extract global spatial information, because it ignores pixel-level pairwise relationships and the global context.

3. **Insufficient information exchange**:
   - In existing lightweight structures, information exchange depends mainly on point-wise (1×1) convolution, which increases computational complexity. The number of parameters rises sharply as the depth increases, resulting in unnecessary computational cost.
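Challenges 2 and 3 can be made concrete with a small calculation. The sketch below is plain Python and is not taken from the paper. It shows that global average pooling returns the same value for two feature maps whose activations sit in opposite corners, and it counts the multiply-accumulate operations (MACs) of a 1×1 convolution against those of per-channel scaling. The feature-map size (64×48) and channel count (40) are hypothetical values chosen only for illustration.

```python
def global_average_pool(feature_map):
    """Average every value of one H x W channel (a list of rows) into a scalar."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def pointwise_conv_macs(height, width, c_in, c_out):
    """MACs of a 1x1 convolution: one c_in-long dot product per output value."""
    return height * width * c_in * c_out


def channel_scaling_macs(height, width, channels):
    """Multiplications needed to scale each channel by one precomputed weight.

    The (small) cost of computing the weights themselves is not included.
    """
    return height * width * channels


# Two channels with the same activations in opposite corners.
top_left = [[9, 9, 0, 0],
            [9, 9, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
bottom_right = [[0, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 9, 9],
                [0, 0, 9, 9]]

# GAP cannot tell the two layouts apart: both pool to 2.25.
print(global_average_pool(top_left), global_average_pool(bottom_right))

# Hypothetical 64 x 48 feature map with 40 input and 40 output channels.
print(pointwise_conv_macs(64, 48, 40, 40))  # 4915200
print(channel_scaling_macs(64, 48, 40))     # 122880
```

With these assumed sizes the 1×1 convolution needs 40 times as many multiplications as per-channel scaling, which is the kind of gap that motivates replacing point-wise convolutions with weighting modules.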
To solve these problems, the paper proposes Greit-HRNet and introduces two new modules:

- **Grouped Channel Weighting (GCW)**: Through grouped conditional channel weighting, it maintains the consistency of weights at different stages, while reducing computational redundancy and maintaining high-resolution features.
- **Global Spatial Weighting (GSW)**: Through global spatial weighting, it fully extracts global spatial information and promotes information exchange across channels.

In addition, the paper applies the Large Kernel Attention (LKA) method to improve the efficiency of the entire network. Experimental results show that Greit-HRNet performs well on both the MS-COCO and MPII datasets, outperforms other lightweight networks, and achieves a better balance between performance and complexity.

### Summary

The main contributions of this paper include:

- Proposing a lightweight high-resolution network (Greit-HRNet), which addresses the deficiencies of existing methods in weight consistency and global spatial information extraction.
- Introducing two new modules, GCW and GSW, which improve the efficiency and performance of the model.
- Verifying the effectiveness and superiority of Greit-HRNet experimentally on multiple datasets.
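As a rough illustration of what computing channel weights in groups can look like, the sketch below applies a generic squeeze-and-excitation-style weighting independently inside each channel group. It is not the paper's GCW module, whose exact design is not given in this summary. The function names and the `mix` matrices (stand-ins for learned parameters) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_means(features):
    """Global average pool: one scalar per channel. features is [C][H][W]."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in features]


def grouped_channel_weights(features, num_groups, mix):
    """One weight in (0, 1) per channel, computed separately for each group.

    mix[g] is a hypothetical square matrix of size (C / num_groups) that
    stands in for the learned parameters of group g.
    """
    means = channel_means(features)
    size = len(means) // num_groups
    weights = []
    for g in range(num_groups):
        group = means[g * size:(g + 1) * size]
        for row in mix[g]:
            weights.append(sigmoid(sum(w * m for w, m in zip(row, group))))
    return weights


def apply_channel_weights(features, weights):
    """Scale every value of channel c by weights[c]."""
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(features, weights)]


def grouped_mix_parameters(channels, num_groups):
    """Parameter count of the per-group matrices: G * (C / G)^2 = C^2 / G."""
    return num_groups * (channels // num_groups) ** 2


# Four 2 x 2 channels split into two groups; identity matrices as dummy parameters.
features = [[[0, 0], [0, 0]],
            [[2, 2], [2, 2]],
            [[0, 0], [0, 0]],
            [[-2, -2], [-2, -2]]]
identity = [[1, 0], [0, 1]]
weights = grouped_channel_weights(features, num_groups=2, mix=[identity, identity])
scaled = apply_channel_weights(features, weights)
```

In this sketch each group sees only its own channels, so the mixing matrices hold C²/G parameters instead of the C² of a single dense layer over all channels; for example, 40 channels in 4 groups need 400 parameters rather than 1600.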