Clustering lightened deep representation for large scale face identification.

Shilun Lin, Zhicheng Zhao, Fei Su
DOI: https://doi.org/10.1145/3018896.3025149
2017-01-01
Abstract: On specific face datasets, such as the LFW benchmark, recent face recognition methods have achieved near-perfect accuracy. However, face identification remains challenging on very large-scale datasets, where real applications are urgently needed; the Microsoft challenge of recognizing one million celebrities (MS-Celeb-1M) has therefore attracted increasing attention. In this paper, we propose a three-step strategy to address this problem. First, an efficient, carefully designed face representation model with a Max-Feature-Map (MFM) activation function is trained on a cross-domain face dataset, i.e., the CASIA-Web dataset, to map raw images into the feature space quickly. Second, face representations with the same MID in MS-Celeb-1M are clustered into three subsets: a pure set, a hard set and a mess set. The cluster centers are used as gallery representations of the corresponding MID; this scheme reduces both the impact of noisy images and the number of comparisons during face matching. Finally, a locality-sensitive hashing (LSH) algorithm is applied to speed up the search for the nearest centroid. Experimental results show that our face CNN model can extract stable and discriminative face representations, and that the proposed three-step strategy achieves promising performance on the MS-Celeb-1M dataset without any manual selection. Furthermore, we find that clustering leaves many MIDs in MS-Celeb-1M with a relatively pure set, which indicates that this scheme is effective for cleaning a huge but messy dataset.
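The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering procedure or the LSH variant, so the code below makes assumptions: MFM is taken to be the element-wise maximum of the two halves of the channel dimension, a cluster center is taken to be the mean of its member vectors, and nearest-centroid search uses random-hyperplane LSH with cosine similarity. All names (`mfm`, `centroid`, `HyperplaneLSH`) are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import defaultdict


def mfm(channels):
    """Max-Feature-Map: pair channel i with channel i + k (k = half the
    channel count) and keep the element-wise maximum, giving k channels."""
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    k = len(channels) // 2
    return [
        [max(a, b) for a, b in zip(channels[i], channels[i + k])]
        for i in range(k)
    ]


def centroid(vectors):
    """Mean of a cluster of feature vectors (assumed form of a cluster center)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class HyperplaneLSH:
    """Random-hyperplane LSH index over centroids (an assumed LSH variant)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Each hyperplane contributes one bit of the hash key.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)
        ]
        self.buckets = defaultdict(list)

    def _key(self, v):
        # Bit is True when v lies on the non-negative side of the hyperplane.
        return tuple(
            sum(p * x for p, x in zip(plane, v)) >= 0 for plane in self.planes
        )

    def add(self, label, v):
        self.buckets[self._key(v)].append((label, v))

    def query(self, v):
        """Label of the most similar centroid in v's bucket, or None if the
        bucket is empty (only one bucket is probed in this sketch)."""
        candidates = self.buckets.get(self._key(v), [])
        if not candidates:
            return None
        return max(candidates, key=lambda item: cosine(item[1], v))[0]


# Usage: four channels of two activations each are reduced to two channels.
features = mfm([[1, 5], [4, 2], [3, 0], [0, 9]])  # [[3, 5], [4, 9]]
```

Because only the query's own bucket is searched, the lookup compares against a small fraction of the gallery, at the cost of occasionally missing the true nearest centroid; practical systems usually probe several hash tables to reduce that risk.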