Multi-depth Dilated Network for Fashion Landmark Detection with Batch-Level Online Hard Keypoint Mining.

Qirong Bu, Kai Zeng, Rui Wang, Jun Feng
DOI: https://doi.org/10.1016/j.imavis.2020.103930
IF: 3.86
2020-01-01
Image and Vision Computing
Abstract: Deep learning has been applied to fashion landmark detection in recent years, and great progress has been made. However, the detection of hard keypoints, such as those that are occluded or invisible, remains challenging. To tackle this problem, this paper makes proposals at two levels. At the feature extraction level, it proposes a novel Multi-Depth Dilated (MDD) block, which is composed of different numbers of dilated convolutions in parallel, and a Multi-Depth Dilated Network (MDDNet) constructed from MDD blocks. At the training level, it proposes a network training method called Batch-level Online Hard Keypoint Mining (B-OHKM). During training, each clothing keypoint corresponds one-to-one to the loss value computed at that keypoint. The greater the loss at a keypoint, the more difficult it is for the network to detect that keypoint. In this way, hard keypoints can be mined effectively, so that the network can be trained in a targeted manner to improve detection performance on hard keypoints. Experiments on two large-scale fashion benchmark datasets demonstrate that the proposed MDDNet, which uses the MDD block and the B-OHKM method, achieves state-of-the-art results.
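The mining idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss function, the selection rule, or the number of keypoints kept, so everything below is an assumption: a per-keypoint mean squared error over flattened heatmaps, pooling of those losses over the whole batch rather than per image, and a hypothetical `top_k` parameter that controls how many of the hardest keypoints contribute to the final loss. The function names `per_keypoint_mse` and `batch_ohkm_loss` are illustrative and are not taken from the paper.

```python
from typing import List

Heatmap = List[float]          # one flattened heatmap for one keypoint
Image = List[Heatmap]          # all keypoint heatmaps of one image
Batch = List[Image]            # all images in a mini-batch


def per_keypoint_mse(pred: Heatmap, target: Heatmap) -> float:
    """Mean squared error between one predicted and one target heatmap."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def batch_ohkm_loss(preds: Batch, targets: Batch, top_k: int) -> float:
    """Average the top_k largest per-keypoint losses pooled over the batch.

    Assumption: "batch-level" means that keypoints from all images compete
    in a single ranking, instead of ranking keypoints inside each image.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")

    # One loss value per (image, keypoint) pair, flattened across the batch.
    losses = [
        per_keypoint_mse(p_kp, t_kp)
        for p_img, t_img in zip(preds, targets)
        for p_kp, t_kp in zip(p_img, t_img)
    ]

    # A larger loss is treated as a harder keypoint; keep only the hardest.
    hardest = sorted(losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Toy batch: 2 images, 2 keypoints each, 2-pixel heatmaps.
preds = [[[0.0, 0.0], [1.0, 1.0]], [[0.5, 0.5], [0.0, 2.0]]]
targets = [[[0.0, 0.0], [1.0, 0.0]], [[0.5, 0.5], [0.0, 0.0]]]

loss_top1 = batch_ohkm_loss(preds, targets, top_k=1)  # only the hardest keypoint
loss_top2 = batch_ohkm_loss(preds, targets, top_k=2)  # two hardest keypoints
loss_all = batch_ohkm_loss(preds, targets, top_k=4)   # plain mean over all keypoints
```

In this toy batch the per-keypoint losses are 0, 0.5, 0 and 2.0, so keeping fewer keypoints concentrates the loss on the poorly predicted ones. In an actual training loop the same selection would be applied to tensors so that gradients flow only through the selected keypoints.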