Porn Streamer Recognition in Live Video Based on Multimodal Knowledge Distillation

Wang Liyuan, Zhang Jing, Yao Jiacheng, Zhuo Li
DOI: https://doi.org/10.1049/cje.2021.07.027
IF: 1.019
2021-01-01
Chinese Journal of Electronics
Abstract: Although deep learning has achieved high accuracy in video content analysis, it does not yet meet the practical demands of porn streamer recognition in live video, because deep network models have many parameters and complex structures. To improve the efficiency of porn streamer recognition in live video, a deep network model compression method based on multimodal knowledge distillation is proposed. First, a teacher model is trained with a visual-speech deep network to obtain the corresponding porn video prediction score. Second, a lightweight student model built from MobileNetV2 and Xception acquires knowledge from the teacher model through a multimodal knowledge distillation strategy. Finally, porn streamers in live video are recognized by combining the lightweight visual-speech student model with a bullet screen text recognition network. Experimental results demonstrate that the proposed method effectively reduces computation cost and improves recognition speed while maintaining acceptable accuracy.
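The abstract does not state the distillation objective, so the following is only a minimal sketch of the widely used temperature-based formulation, in which a student matches the teacher's softened predictions while also fitting the ground-truth label. It is not the authors' loss. The function names, the `temperature` and `alpha` parameters, and the two-class logits are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical distillation objective (an assumption, not the paper's loss).

    Combines the KL divergence between softened teacher and student
    predictions with the cross-entropy of the student on the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    kl = sum(
        pt * math.log(pt / max(ps, 1e-12))
        for pt, ps in zip(p_teacher, p_student_soft)
        if pt > 0.0
    )

    # Ordinary cross-entropy of the student against the hard label.
    p_student = softmax(student_logits)
    ce = -math.log(max(p_student[label], 1e-12))

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term's as the temperature changes.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce
```

With identical student and teacher logits the soft term vanishes and only the weighted cross-entropy remains. In a multimodal setting such a loss could be applied per modality or to fused scores; which of these the paper uses is not specified in the abstract.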