Lip Forgery Video Detection via Multi-Phoneme Selection

Jiaying Lin, Wenbo Zhou, Honggu Liu, Hang Zhou, Weiming Zhang, Nenghai Yu
2021-01-01
Abstract: Deepfake techniques can produce realistic manipulated videos, including full-face synthesis and local region forgery. General detection methods work well on the former but usually struggle to capture local artifacts, especially in lip forgery. In this paper, we focus on the lip forgery detection task. We first establish a robust mapping from audio to lip shapes. We then classify the lip shape in each video frame according to the spoken phoneme, which enables the network to capture the dissonance between lip shapes and phonemes in fake videos and increases interpretability. Each lip shape-phoneme set is used to train a sub-model, and the sub-models with better discrimination are selected to form an ensemble classification model. Extensive experimental results demonstrate that our method outperforms most state-of-the-art methods on both the public DFDC dataset and a self-organized lip forgery dataset.
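The selection-and-ensemble step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion or the fusion rule, so everything here is an assumption: sub-models are ranked by a hypothetical validation accuracy, the top k are kept, and their per-frame fake probabilities are averaged. The phoneme labels, scores, and function names are illustrative only and are not taken from the paper.

```python
from statistics import mean


def select_phonemes(val_accuracy, top_k):
    """Keep the top_k phoneme groups whose sub-models score best on validation data.

    Assumption: "better discrimination" is approximated by validation accuracy.
    """
    ranked = sorted(val_accuracy, key=val_accuracy.get, reverse=True)
    return ranked[:top_k]


def ensemble_score(frame_scores, selected):
    """Average fake probabilities over the selected phoneme groups.

    frame_scores maps a phoneme label to the fake probabilities that the
    corresponding sub-model assigned to this video's frames. Phonemes that
    were not selected, or that do not occur in the video, are ignored.
    Assumption: simple averaging stands in for the paper's fusion rule.
    """
    per_phoneme = [mean(frame_scores[p]) for p in selected if frame_scores.get(p)]
    if not per_phoneme:
        raise ValueError("video contains no frames for any selected phoneme")
    return mean(per_phoneme)


# Hypothetical validation accuracies for four phoneme-specific sub-models.
val_accuracy = {"AA": 0.71, "M": 0.88, "OW": 0.83, "S": 0.64}
selected = select_phonemes(val_accuracy, top_k=2)

# Hypothetical per-frame fake probabilities for one video, grouped by phoneme.
video = {"M": [0.9, 0.7], "OW": [0.6], "S": [0.1]}
score = ensemble_score(video, selected)
print(selected, round(score, 3))
```

In this sketch the "S" frames do not affect the final score because that sub-model was not selected, which mirrors the idea that only the more discriminative lip shape-phoneme sets contribute to the ensemble.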