This supplementary document presents material that was not included in the letter
NIE XiuShan, CHAI YanE, LIU Ju, SUN JianDe, YIN YiLong
2016-01-01
Abstract: With the rapid development of network and multimedia technologies, users can easily generate, store, and share video content over the Internet. At the same time, numerous illegal and useless near-duplicate videos, generated through simple reformatting, transformation, and editing, appear on the web. These near-duplicate videos inconvenience users surfing the Internet and are considered an infringement of copyright. Robust video hashing, also called video fingerprinting, does not require access to video content at the time of creation and can be used to detect content that already exists. In general, a robust video hash is a short digest extracted from a video that remains robust to content-preserving attacks such as noise, logo addition, and contrast change. That is, just as a human fingerprint identifies a specific person, video hashing can classify video content by extracting and comparing short digests. As such, video hashing is a feasible means of detecting near-duplicate videos. The spatial information in each video frame represents image content in 2D, whereas temporal information describes how that content evolves along the third dimension. Spatiotemporal-based methods treat a short video clip as a 3D cube and apply signal processing tools (e.g., DCT, RBT, and wavelet) to it in order to fuse spatial and temporal information. These methods primarily consider relations and structure along the axis directions in 3D space, where content may change only slightly because the objects in a short video clip remain in nearly the same position in adjacent frames. Because the content changes little along these directions, the amount of information captured is correspondingly small. Consequently, existing spatiotemporal methods cannot capture rich information. On the other hand, in most cases, users first focus on the center of an image, where visual saliency commonly appears.
Users then gradually look at the entire image. This resembles a stone dropped into water, with ripples spreading outward in rings. A ring partition of an image along the radial direction is therefore close to how the human visual system perceives the image. Furthermore, the content or objects of an image typically change from the center toward the boundary. Inspired by these situations and the