Abstract: Most advanced backbones overlook real-time and speed requirements, which are directly affected by model size. In this study, we present a lightweight model, IDPNet. We design a block that runs a dense layer in parallel with an identity block and use it as the basic block of the backbone. We also introduce an intra-level block fusion representation head to fuse high-resolution representations. As a result, IDPNet reduces the number of parameters by 85.3% on both datasets, and the GFLOPs by 80.4% and 60.7%, respectively. To extend its usability, we propose two additional variants, IDPNet-Balance and IDPNet-Precision. We train and test IDPNet on the COCO keypoint detection dataset and the MPII human pose dataset without pretraining. The best accuracy on both datasets is higher than that of previous networks. At test time, the three models take 13 ms, 20 ms, and 21 ms per image, respectively, and so essentially achieve real-time performance.

IDPNet: a Light-Weight Network and Its Variants for Human Pose Estimation