Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement Detection In Videos

Shervin Minaee, Imed Bouazizi, Prakash Kolan, Hossein Najafzadeh
DOI: https://doi.org/10.48550/arXiv.1806.08612
2018-06-22
Computer Vision and Pattern Recognition
Abstract: Personalized advertisement is a crucial task for many online businesses and video broadcasters. Many of today's broadcasters show the same commercial to all customers, but different viewers have different interests, so it seems reasonable to customize commercials for different groups of people, chosen based on their demographic features and history. In this project, we propose a framework that takes broadcast videos, analyzes them, detects each commercial, and replaces it with a more suitable one. We propose a two-stream audio-visual convolutional neural network in which one branch analyzes the visual information and the other analyzes the audio information; the audio and visual embeddings are then fused and used for commercial detection and content categorization. We show that using both the visual and audio content of the videos significantly improves model performance for video analysis. The network is trained on a dataset of more than 50k regular video and commercial shots, and achieves much better performance than models based on hand-crafted features.
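The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes the two branch embeddings are already computed and that fusion is plain concatenation followed by two linear heads with softmax; the abstract does not specify the fusion operator, the embedding sizes, or the number of content categories, so `fuse_and_classify`, `random_layer`, and every dimension below are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(visual_emb, audio_emb, ad_head, category_head):
    """Concatenate the two embeddings (assumed fusion) and apply both heads."""
    fused = visual_emb + audio_emb  # list concatenation
    ad_probs = softmax(linear(fused, *ad_head))
    category_probs = softmax(linear(fused, *category_head))
    return fused, ad_probs, category_probs


# Assumed sizes, chosen only to keep the example small.
VISUAL_DIM, AUDIO_DIM, NUM_CATEGORIES = 8, 4, 5

rng = random.Random(0)
# Stand-ins for the outputs of the visual and audio CNN branches.
visual_emb = [rng.random() for _ in range(VISUAL_DIM)]
audio_emb = [rng.random() for _ in range(AUDIO_DIM)]

ad_head = random_layer(rng, 2, VISUAL_DIM + AUDIO_DIM)  # commercial vs. regular
category_head = random_layer(rng, NUM_CATEGORIES, VISUAL_DIM + AUDIO_DIM)

fused, ad_probs, category_probs = fuse_and_classify(
    visual_emb, audio_emb, ad_head, category_head
)
```

Sharing one fused vector between both heads mirrors the abstract's statement that the fused embedding serves both commercial detection and content categorization; with untrained weights the output probabilities carry no meaning beyond showing the shapes involved.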