Keypoint based weakly supervised human parsing

Zhonghua Wu,Guosheng Lin,Jianfei Cai
DOI: https://doi.org/10.1016/j.imavis.2019.08.005
IF: 3.86
2019-01-01
Image and Vision Computing
Abstract:Fully convolutional networks (FCN) have achieved great success in human parsing in recent years. In conventional human parsing tasks, pixel-level labeling is required for guiding the training, which usually involves enormous human labeling efforts. To ease the labeling efforts, we propose a novel weakly supervised human parsing method which only requires simple object keypoint annotations for learning. We develop an iterative learning method to generate pseudo part segmentation masks from keypoint labels. With these pseudo masks, we train a FCN network to output pixel-level human parsing predictions. Furthermore, we develop a correlation network to perform joint prediction of part and object segmentation masks and improve the segmentation performance. The experiment results show that our weakly supervised method is able to achieve very competitive human parsing results. Despite that our method only uses simple keypoint annotations for learning, we are able to achieve comparable performance with fully supervised methods which use the expensive pixel-level annotations.
What problem does this paper attempt to address?