Feature-Aware Uniform Tessellations on Video Manifold for Content-Sensitive Supervoxels

Ran Yi, Zipeng Ye, Wang Zhao, Minjing Yu, Yu-Kun Lai, Yong-Jin Liu
DOI: https://doi.org/10.1109/tpami.2020.2979714
IF: 23.6
2021-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: Supervoxels are perceptually meaningful atomic regions in videos, obtained by grouping voxels that exhibit coherence in both appearance and motion. In this paper, we propose content-sensitive supervoxels (CSS), which are regularly shaped 3D primitive volumes with the following characteristic: they are typically larger and longer in content-sparse regions (i.e., with homogeneous appearance and motion), and smaller and shorter in content-dense regions (i.e., with high variation of appearance and/or motion). To compute CSS, we map a video Ξ to a 3-dimensional manifold M embedded in R^6, whose volume elements give a good measure of the content density in Ξ. We propose an efficient Lloyd-like method with a splitting-merging scheme to compute a uniform tessellation on M, which induces the CSS in Ξ. Theoretically, our method has a good competitive ratio of O(1). We also present a simple extension of CSS to stream CSS for processing long videos that cannot be loaded into main memory at once. We evaluate CSS, stream CSS and seven representative supervoxel methods on four video datasets. The results show that our method outperforms existing supervoxel methods.
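The idea of lifting a video into R^6 and clustering there can be illustrated with a rough sketch. This is not the authors' algorithm: it runs a plain Lloyd (k-means) iteration on points (x, y, t, c1, c2, c3), and it omits both the splitting-merging scheme and the manifold volume measure described in the abstract. The function names `video_to_points` and `lloyd` and the scale parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def video_to_points(video, spatial_scale=1.0, temporal_scale=1.0, feature_scale=1.0):
    """Map each voxel of a (T, H, W, 3) video to a point (x, y, t, c1, c2, c3) in R^6.

    The scale parameters are illustrative assumptions, not values from the paper.
    """
    T, H, W, _ = video.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
    coords = np.stack(
        [x * spatial_scale, y * spatial_scale, t * temporal_scale], axis=-1
    ).astype(float)
    feats = video.astype(float) * feature_scale
    return np.concatenate([coords, feats], axis=-1).reshape(-1, 6)

def lloyd(points, k, iters=10, seed=0):
    """Plain Lloyd iteration in R^6: assign to nearest center, then recenter."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center: shape (N, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    # Final assignment against the last set of centers.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

if __name__ == "__main__":
    # Tiny synthetic video: left half dark, right half bright.
    video = np.zeros((2, 4, 4, 3))
    video[:, :, 2:, :] = 1.0
    labels, centers = lloyd(video_to_points(video, feature_scale=10.0), k=2)
    print(labels.reshape(2, 4, 4))
```

Raising `feature_scale` relative to the spatial and temporal scales makes appearance differences count for more in the distance, which is one simple way to make the clusters more sensitive to content. The dense distance matrix uses O(N·k) memory, so this sketch is only suitable for very small inputs.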