Acceleration of multi-task cascaded convolutional networks

Long-Hua Ma, Hang-Yu Fan, Zhe-Ming Lu, Dong Tian
DOI: https://doi.org/10.1049/iet-ipr.2019.0141
IF: 2.3
2020-01-01
IET Image Processing
Abstract: The multi-task cascaded convolutional neural network (MTCNN) is a human face detection architecture that uses a cascaded structure with three stages (P-Net, R-Net and O-Net). The authors aim to reduce the computation time of the whole MTCNN pipeline. They find that the non-maximum suppression (NMS) processes after the P-Net occupy over half of the computation time. Therefore, they propose a self-fine-tuning method that makes the computation time of the NMS process easier to control. Self-fine-tuning is a training trick that uses hard samples generated by P-Net to retrain P-Net. After self-fine-tuning, the distribution of face probabilities generated by P-Net changes, and the tail of the distribution becomes thinner. With a thinner tail, the number of NMS input boxes is easier to control: a suitable threshold for filtering the face boxes passes fewer boxes to NMS, so the computation time is reduced. To preserve the performance of MTCNN, the authors also propose a landmark data set augmentation, which enhances the performance of the self-fine-tuned MTCNN. Experiments show that the proposed scheme significantly reduces the computation time of MTCNN.
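To illustrate why the number of boxes that survive the probability threshold drives the NMS cost, the following is a minimal sketch of score filtering followed by greedy NMS. It is not the authors' implementation: the function names `filter_by_score` and `nms`, the `[x1, y1, x2, y2]` box layout, and the threshold values are assumptions chosen for illustration, and boxes are assumed to have positive area.

```python
import numpy as np

def filter_by_score(boxes, scores, threshold):
    """Keep only boxes whose face probability reaches the threshold.

    Hypothetical helper: with a thinner-tailed score distribution, fewer
    boxes pass this filter, so the NMS step below receives less input.
    """
    keep = scores >= threshold
    return boxes[keep], scores[keep]

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) array.
    The default iou_threshold is illustrative, not taken from the paper.
    """
    if len(boxes) == 0:
        return []
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]  # highest score first
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box.
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop boxes that overlap the current best box too strongly.
        order = rest[iou <= iou_threshold]
    return kept

# Illustrative data: two overlapping boxes, one separate box, one low-score box.
demo_boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11],
                       [20, 20, 30, 30], [0, 0, 10, 10]], dtype=float)
demo_scores = np.array([0.9, 0.8, 0.7, 0.3])

demo_f_boxes, demo_f_scores = filter_by_score(demo_boxes, demo_scores, 0.6)
demo_kept = nms(demo_f_boxes, demo_f_scores, iou_threshold=0.5)
```

Each pass of the loop compares one box against all remaining boxes, so the work grows quickly with the number of input boxes. Filtering more boxes out before NMS is therefore the lever the abstract describes.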
What problem does this paper attempt to address?