Abstract: Pose estimation has been an active topic in machine vision in recent years. Animals are ubiquitous in nature, and analyzing their shape and movement matters in many fields and industries. To improve detection accuracy, existing pose estimation models often consume large amounts of computing and memory resources. A key problem for pose estimation methods is therefore to make the model lightweight and reduce computational overhead while preserving accuracy. In this paper, we focus on the structure of the convolutional neural network in animal pose estimation, construct a lightweight and efficient stacked hourglass network model that balances computation against accuracy, and design the application algorithm on top of it. To address the large number of parameters in depthwise convolutional neural networks, we propose a lightweight residual module, ICCW-Bottle, built on a conditional channel-weighted method improved with lightweight efficient channel attention; it reduces the weight of the network and captures feature information at different scales. To address the loss of a large amount of feature information after the network's pooling operations, we propose a lightweight dual-branch fusion module that fully integrates high-level semantic information and low-level detailed features with a small number of parameters. Finally, as in the CC-SSL method, the model is trained jointly on synthetic and real animal datasets; CC-SSL, however, does not take the computational cost of the model into account and consumes a lot of time and memory to run. Experiments show that, compared with CC-SSL, the proposed method improves PCK@0.05 by 5.5% on the TigDog dataset. The proposed model reduces the number of parameters and the amount of computation of the network while keeping information loss low and maintaining accuracy. Ablation experiments confirm that the overall network is both advanced and effective.
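For readers unfamiliar with the PCK@0.05 metric reported in the abstract, the following is a minimal sketch of how a percentage-of-correct-keypoints score is commonly computed. It is not the authors' evaluation code. The function name `pck`, the `visible` mask, and the choice of `norm_length` (for example a bounding-box side or the image size) are assumptions made for illustration; the abstract does not state which normalization is used.

```python
import math


def pck(pred, gt, visible, norm_length, threshold=0.05):
    """Fraction of visible keypoints predicted within threshold * norm_length.

    pred, gt    : sequences of (x, y) keypoint coordinates in pixels
    visible     : sequence of booleans, True if the keypoint is annotated
    norm_length : reference length in pixels (assumed; e.g. a bounding-box
                  side or the image size)
    threshold   : tolerance as a fraction of norm_length (0.05 for PCK@0.05)
    """
    max_dist = threshold * norm_length
    correct = 0
    total = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # unannotated keypoints are excluded from the score
        total += 1
        if math.hypot(px - gx, py - gy) <= max_dist:
            correct += 1
    # No visible keypoints: the score is undefined, so return NaN
    return correct / total if total else float("nan")


# Hypothetical example: three keypoints, reference length of 100 pixels,
# so a prediction counts as correct if it is within 5 pixels of the label.
pred = [(10.0, 10.0), (52.0, 50.0), (90.0, 95.0)]
gt = [(10.0, 12.0), (50.0, 50.0), (80.0, 80.0)]
score = pck(pred, gt, [True, True, True], norm_length=100.0)
```

In this hypothetical example the first two predictions lie 2 pixels from their labels and the third about 18 pixels away, so two of the three keypoints count as correct.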

Channel sifted model for pose estimation

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention

Context-Guided Adaptive Network for Efficient Human Pose Estimation

Simple and Lightweight Human Pose Estimation

SLBNet: Shallow and Lightweight Bilateral Network for Pose Estimation

Lightweight Human Pose Estimation Under Resource-Limited Scenes

DSPNet: A low computational-cost network for human pose estimation

Implicit Decouple Network for Efficient Pose Estimation

Ghost shuffle lightweight pose network with effective feature representation and learning for human pose estimation

LSDNet: lightweight stochastic depth network for human pose estimation

Optimized S2E Attention Block based Convolutional Network for Human Pose Estimation

Towards Simple and Accurate Human Pose Estimation with Stair Network

Lightweight high-resolution network based on adaptive cross-dimensional weighting for human pose estimation

A Lightweight Network Based on Pyramid Residual Module for Human Pose Estimation

Animal Pose Estimation Algorithm Based on the Lightweight Stacked Hourglass Network

EANet: Towards Lightweight Human Pose Estimation With Effective Aggregation Network

Human Pose Estimation Based on Efficient and Lightweight High-Resolution Network (EL-HRNet)

Multi-Person Pose Estimation with Enhanced Channel-wise and Spatial Information

An improved lightweight high-resolution network based on multi-dimensional weighting for human pose estimation

Pose Estimation for Swimmers in Video Surveillance

Cascaded Pyramid Network for Multi-Person Pose Estimation