MVPose: Realtime Multi-Person Pose Estimation Using Motion Vector on Mobile Devices
Jinrui Zhang, Deyu Zhang, Huan Yang, Yunxin Liu, Ju Ren, Xiaohui Xu, Fucheng Jia, Yaoxue Zhang
DOI: https://doi.org/10.1109/tmc.2021.3139940
IF: 6.075
2023-01-01
IEEE Transactions on Mobile Computing
Abstract: We present MVPose, a novel system that enables real-time multi-person pose estimation (PE) on commodity mobile devices. MVPose combines three novel techniques. First, it takes a motion-vector-based approach to track human keypoints quickly and accurately across consecutive frames, rather than running expensive human-detection and pose-estimation models on every frame. Second, it designs a mobile-friendly PE model that uses lightweight feature extractors and a multi-stage network to significantly reduce the latency of pose estimation without compromising model accuracy. Third, it leverages the heterogeneous computing resources of both the CPU and the GPU to execute the pose-estimation model for multiple persons in parallel, which further reduces the total latency. We present extensive experiments that evaluate the effectiveness of the proposed techniques by implementing MVPose on five off-the-shelf commercial smartphones. Evaluation results show that MVPose achieves over 30 frames per second PE with 4 persons per frame, significantly outperforming the state-of-the-art baseline with a speedup of up to 5.7× on CPU and 3.8× on GPU in latency. Compared with the baseline, MVPose improves multi-person PE accuracy by 10.1%. Furthermore, MVPose achieves up to 74.3% and 57.6% energy-per-frame savings on average in comparison with the baseline on mobile CPU and GPU, respectively.
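The motion-vector idea in the first technique can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function name `propagate_keypoints`, the block-wise `motion_field` layout, the 16-pixel block size, and the convention that each vector gives the displacement from the previous frame to the current one. The sketch moves each keypoint by the motion vector of the block that contains it, so no pose model has to run on that frame.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (x, y) pixel coordinates of a keypoint
Vector = Tuple[float, float]  # (dx, dy) displacement of one block


def propagate_keypoints(
    keypoints: List[Point],
    motion_field: List[List[Vector]],
    block_size: int = 16,  # assumed block size, hypothetical
) -> List[Point]:
    """Shift each keypoint by the motion vector of its enclosing block.

    Assumes motion_field[row][col] holds the displacement of the block at
    that grid position from the previous frame to the current frame.
    """
    rows = len(motion_field)
    cols = len(motion_field[0])
    moved: List[Point] = []
    for x, y in keypoints:
        # Find the block containing the keypoint, clamped to the grid.
        col = min(max(int(x // block_size), 0), cols - 1)
        row = min(max(int(y // block_size), 0), rows - 1)
        dx, dy = motion_field[row][col]
        moved.append((x + dx, y + dy))
    return moved


# Hypothetical 2x2 grid of block motion vectors for a 32x32 frame.
field = [
    [(1.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (-2.0, 3.0)],
]
tracked = propagate_keypoints([(5.0, 5.0), (20.0, 18.0)], field)
```

In a scheme like this, tracking error would accumulate over frames, so a full pose-estimation pass would presumably still be needed from time to time to re-anchor the keypoints.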