Hier R-CNN: Instance-Level Human Parts Detection and A New Benchmark
Lu Yang, Qing Song, Zhihui Wang, Mengjie Hu, Chun Liu
DOI: https://doi.org/10.1109/tip.2020.3029901
IF: 10.6
2021-01-01
IEEE Transactions on Image Processing
Abstract: Detecting human parts at instance-level is an essential prerequisite for the analysis of human keypoints, actions, and attributes. Nonetheless, there is a lack of a large-scale, richly annotated dataset for human parts detection. We fill this gap by proposing COCO Human Parts. Built on COCO 2017, the proposed dataset is the first instance-level human parts dataset and contains images of complex scenes with high diversity. To reflect the diversity of the human body in natural scenes, we annotate human parts with (a) location in terms of a bounding box, (b) type, including face, head, hand, and foot, (c) the subordinate relationship between person and human parts, and (d) fine-grained classification into right-hand/left-hand and left-foot/right-foot. Many higher-level applications and studies can build on COCO Human Parts, such as gesture recognition, face/hand keypoint detection, visual actions, human-object interactions, and virtual reality. The dataset contains 268,030 person instances across 66,808 images, with an average of 2.83 parts per person instance. We provide a statistical analysis of the accuracy of our annotations. In addition, we propose a strong baseline for detecting human parts at instance-level over this dataset in an end-to-end manner, called Hier(archy) R-CNN. It is a simple but effective extension of Mask R-CNN that can detect the human parts of each person instance and predict the subordinate relationship between them. Code and dataset are publicly available (https://github.com/soeaver/Hier-R-CNN).
computer science, artificial intelligence, engineering, electrical & electronic
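
The annotation scheme described in the abstract (a bounding box per part, a part type with left/right distinction for hands and feet, and a link from each part to its person) can be pictured with a small data structure. The sketch below is only an illustration under stated assumptions: the class names, field names, the (x, y, width, height) box layout, and the six-label reading of the part types are hypothetical and are not taken from the released dataset files or the Hier R-CNN code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed label set: one reading of the abstract's part types with the
# left/right split applied to hands and feet. Not the dataset's official names.
PART_TYPES = ("face", "head", "right-hand", "left-hand", "right-foot", "left-foot")

# Assumed box layout (x, y, width, height); hypothetical, for illustration only.
Box = Tuple[float, float, float, float]


@dataclass
class Part:
    part_type: str
    bbox: Box

    def __post_init__(self) -> None:
        if self.part_type not in PART_TYPES:
            raise ValueError(f"unknown part type: {self.part_type}")


@dataclass
class Person:
    bbox: Box
    # The subordinate relationship is modelled by nesting parts under a person.
    parts: List[Part] = field(default_factory=list)


def parts_per_person(persons: List[Person]) -> float:
    """Average number of annotated parts per person instance."""
    if not persons:
        return 0.0
    return sum(len(p.parts) for p in persons) / len(persons)


# Toy example with made-up coordinates.
alice = Person(
    bbox=(10.0, 20.0, 100.0, 300.0),
    parts=[
        Part("head", (40.0, 20.0, 40.0, 50.0)),
        Part("face", (45.0, 30.0, 30.0, 35.0)),
        Part("left-hand", (95.0, 150.0, 15.0, 20.0)),
    ],
)
bob = Person(bbox=(200.0, 50.0, 60.0, 180.0), parts=[Part("head", (215.0, 50.0, 30.0, 35.0))])
average = parts_per_person([alice, bob])

# Figures quoted in the abstract, used to derive persons per image.
persons_per_image = 268_030 / 66_808
```

Applied to the whole dataset, the same averaging is what the quoted figure of 2.83 parts per person instance expresses; dividing the quoted counts also gives roughly four person instances per image.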