From Visual Search to Video Compression: A Compact Representation Framework for Video Feature Descriptors.

Xiang Zhang, Siwei Ma, Shiqi Wang, Shanshe Wang, Xinfeng Zhang, Wen Gao
DOI: https://doi.org/10.1109/dcc.2016.18
2016-01-01
Abstract: Visual feature descriptors have been successfully deployed in a wide range of applications, such as visual retrieval and analysis. To transmit these descriptors over bandwidth-limited networks, a highly efficient feature coding technique is needed to maximize compression capability and achieve compact feature representations. In this paper, a hybrid visual feature descriptor compression framework is presented and implemented in the encoding and decoding loops of texture videos. In particular, multiple-hypothesis prediction is employed to effectively remove redundancies that originate not only from spatial and temporal similarities, but also from reconstructed video frames. As the ultimate purpose of the transmitted descriptors is retrieval, a rate-accuracy optimization (RAO) technique is proposed to obtain the best tradeoff between rate and retrieval performance. Such a paradigm enables a conventional video stream to support highly efficient retrieval and analysis with very low bitrate consumption. Moreover, we demonstrate that texture video compression can also benefit from the additional information provided by the transmitted descriptors, leading to a significant improvement in coding efficiency on top of the High Efficiency Video Coding (HEVC) standard. Extensive simulations show that the proposed method offers significant bitrate reduction in representing both the descriptors and the texture video frames, while providing desirable retrieval performance.