Self-Supervised Human Pose Based Multi-Camera Video Synchronization

Liqiang Yin,Ruize Han,Wei Feng,Song Wang
DOI: https://doi.org/10.1145/3503161.3547766
2022-01-01
Abstract:Multi-view video collaborative analysis is an important task and has many applications in multimedia community. However, it always requires the given multiple videos to be temporally synchronized. Existing methods commonly synchronize the videos by the wired communication, which may hinder the practical application in real world, especially for moving cameras. In this paper, we focus on the human-centric video analysis and propose a self-supervised framework for the automatic multi-camera video synchronization. Specifically, we develop SeSyn-Net with the 2D human pose as input for feature embedding and design a series of self-supervised losses to effectively extract the view-invariant but time-discriminative representation for video synchronization. We also build two new datasets for the performance evaluation. Extensive experimental results verify the effectiveness of our method, which achieves the superior performance compared to both the classical and state-of-the-art methods.
What problem does this paper attempt to address?