Abstract: This paper describes methods for automatically associating faces detected in multimedia documents with the names that appear in the surrounding metadata. We treat the task within an image matching (IM) framework: external Web facial images are retrieved automatically in advance to serve as the gallery face set for each name, and a detected face is assigned to one of the names, or to none of them, according to the association scores between the two kinds of faces and a set of constraints. Several important issues are investigated within the IM framework. In collecting Web facial images, beyond the basic scheme that uses only a celebrity name as the query to crawl facial images, we propose a context-assisted image search method to enhance the relevance and discriminability of the retrieved faces. In constraint formulation, we propose an assigning-thresholding (AT) pipeline that uniformly ensures the name-face correspondence is strictly one-to-one and sets low-confidence associations to null assignments. In association score computation, we propose methods that combine IM with the well-established graph-based association (GA) method at different stages, aiming to produce more accurate scores that benefit the association. Based on these efforts, we propose two methods: Accu-IM, which performs the association as accurately as possible, and Fast-IM, which performs the association in real time. Extensive experiments on datasets of captioned news images and Web videos demonstrate the advantages of the proposed components, both individually and jointly; they consistently yield improvements under different settings compared with state-of-the-art methods.
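The assigning-thresholding (AT) idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `assign_then_threshold`, the score matrix layout, and the `threshold` parameter are assumptions made for illustration, and the exhaustive search stands in for whatever assignment solver the paper actually uses. The sketch first finds a strictly one-to-one face-to-name assignment that maximizes the total association score, then turns any assigned pair whose score falls below the threshold into a null assignment.

```python
from itertools import permutations


def assign_then_threshold(scores, threshold):
    """Hypothetical AT sketch.

    scores[i][j] is the association score between detected face i and name j.
    Returns a list whose i-th entry is the name index assigned to face i,
    or None for a null assignment.
    """
    n_faces = len(scores)
    n_names = len(scores[0]) if scores else 0

    # Candidate targets: every real name, plus one "null" slot per face so
    # that any face may stay unassigned during the assigning step.
    targets = list(range(n_names)) + [None] * n_faces

    # Assigning step: exhaustive search over one-to-one assignments
    # (only practical for the handful of faces and names in one document).
    best_total, best_assignment = float("-inf"), []
    for candidate in permutations(targets, n_faces):
        total = sum(
            scores[i][j] for i, j in enumerate(candidate) if j is not None
        )
        if total > best_total:
            best_total, best_assignment = total, list(candidate)

    # Thresholding step: low-confidence associations become null assignments.
    return [
        j if j is not None and scores[i][j] >= threshold else None
        for i, j in enumerate(best_assignment)
    ]


if __name__ == "__main__":
    # Three detected faces, two candidate names (made-up scores).
    example_scores = [[0.9, 0.2], [0.8, 0.3], [0.1, 0.1]]
    print(assign_then_threshold(example_scores, threshold=0.5))
    # prints [0, None, None]
```

In the example, face 1 scores highest against name 0, but the one-to-one constraint gives name 0 to face 0, and the remaining pairing of face 1 with name 1 (score 0.3) is then rejected by the threshold. For larger inputs, a polynomial-time assignment solver would be the natural replacement for the exhaustive search.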
