A Multi-Person Pose Estimation with LSTM for Video Stream

Lvcai Chen, Chunyan Yu, Li Chen
DOI: https://doi.org/10.1109/eitce47263.2019.9094979
2019-01-01
Abstract: Applications of human pose estimation bring great benefits to daily life. Most real-world applications operate on single-frame images. Single-frame methods achieve good accuracy, but they discard the temporal information present in video. To preserve this information, we combine a Long Short-Term Memory (LSTM) network with a single-frame estimation network to perform multi-person pose estimation on video streams. For the single-frame network, this paper adds deconvolution layers to a residual network to obtain high-resolution image information and trains the model with a loss function that incorporates the bounding-box area. For the video-stream network, this paper uses the LSTM to process the temporal information extracted by the single-frame network and produce the multi-person pose estimates. Experiments on the COCO and PoseTrack2018 datasets verify the effectiveness of the method.
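To make the role of the deconvolution layers concrete, the sketch below computes how much spatial resolution a stack of transposed convolutions can recover from a residual backbone's coarse feature map. The abstract does not give the layer configuration, so every number here is an assumption chosen for illustration: a 256x192 person crop, a backbone with overall stride 32, and three deconvolution layers with kernel 4, stride 2, padding 1. The helper names `deconv_out` and `upsampled_shape` are hypothetical.

```python
def deconv_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def upsampled_shape(height, width, backbone_stride=32, num_deconv=3):
    """Heatmap size after the backbone and a stack of deconvolution layers.

    Assumption: the residual backbone reduces each spatial dimension by
    backbone_stride, and every deconvolution layer uses the defaults above.
    """
    h, w = height // backbone_stride, width // backbone_stride
    for _ in range(num_deconv):
        h, w = deconv_out(h), deconv_out(w)
    return h, w


# Assumed 256x192 input crop: 8x6 backbone features become 64x48 heatmaps.
print(upsampled_shape(256, 192))  # (64, 48)
```

Under these assumptions each deconvolution layer doubles the resolution, so three layers turn the 8x6 backbone output into 64x48 heatmaps, one quarter of the input size along each axis.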