WLFDB: Weakly labeled face databases

D Wang, S Hoi, J Zhu
2014-01-01
Abstract: Auto face annotation aims to detect human faces in a facial image and tag them with the corresponding human names. It is a fundamental research problem and plays a critical role in many real-world applications in computer vision and pattern recognition. Instead of adopting traditional “model-based face annotation” techniques, “search-based face annotation” has recently been gaining increasing attention for mining the large amounts of weakly labeled facial images freely available on the Internet. Although researchers have constructed several web facial image databases (e.g., LFW, Yahoo!News, FAN-large, and PubFig), they are not suitable for addressing the search-based face annotation problem. To facilitate research on search-based face annotation, we build WLFDB, a large-scale Weakly Labeled Face Database, with three major characteristics: (i) Large-scale Weakly Labeled Facial Images: WLFDB consists of over 6,000 subjects and more than 700,000 web facial images, where the images are weakly labeled with names from the query text; (ii) Rich Data Types: We make WLFDB a comprehensive testbed with three types of data: “raw web facial images”, “aligned facial images”, and “facial feature representations”, so researchers can use any of these data types with minimal effort; (iii) Benchmark Protocol: We provide a standard benchmark protocol to evaluate the performance of different search-based face annotation techniques on the same ground-truth test set. In addition, three baseline algorithms are evaluated on the same test sets according to the hit rate metric. In summary, WLFDB is a large-scale database of weakly labeled facial images that attempts to model real-world web facial images. We hope it will not only facilitate research on search-based face annotation, but also benefit other kinds of face-related research, such as face detection, alignment, verification, and recognition.
WLFDB is freely available to the public for non-commercial research purposes at http://wlfdb.stevenhoi.com/.
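To make the search-based setting and the hit rate metric more concrete, the following is a minimal Python sketch, not the paper's baselines or its official protocol. It assumes each database face is a feature vector paired with a weak (possibly noisy) name label, annotates a query face by a simple majority vote among its nearest neighbours, and counts a "hit" when the true name appears among the returned names. The function names `annotate` and `hit_rate`, the parameters `k` and `t`, and the Euclidean distance are all illustrative assumptions.

```python
import math
from collections import Counter


def annotate(query, gallery, k=3, t=1):
    """Return up to t candidate names for `query`.

    `gallery` is a list of (feature_vector, weak_label) pairs (hypothetical
    layout). Names are ranked by how often they occur among the k gallery
    faces closest to the query in Euclidean distance.
    """
    neighbours = sorted(gallery, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return [name for name, _ in votes.most_common(t)]


def hit_rate(predictions, ground_truth):
    """Fraction of test faces whose true name is in the returned name list."""
    hits = sum(truth in names for names, truth in zip(predictions, ground_truth))
    return hits / len(ground_truth)
```

Majority voting is used here only because it is the simplest way to turn retrieved weak labels into a ranked name list; any method that refines or reweights the noisy labels could replace `annotate` while being scored by the same `hit_rate` function.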