OSRI: A Rotationally Invariant Binary Descriptor
Xianwei Xu, Lu Tian, Jianjiang Feng, Jie Zhou
DOI: https://doi.org/10.1109/tip.2014.2324824
IF: 10.6
2014-01-01
IEEE Transactions on Image Processing
Abstract: Binary descriptors are becoming widely used in computer vision because of their high matching efficiency and low memory requirements. Conventional approaches, which first compute a floating-point descriptor and then binarize it, are computationally expensive, so some recent efforts have focused on computing binary descriptors directly from local image patches. Although these binary descriptors enable a significant speedup in processing time, their performance usually drops considerably because of orientation estimation errors and limited descriptive power. To address these issues, we propose a novel binary descriptor based on the ordinal and spatial information of regional invariants (OSRIs) over a rotation-invariant sampling pattern. Our main contributions are twofold: 1) each bit in OSRI is computed from difference tests of regional invariants over pairwise sampling regions, rather than the difference tests of pixel intensities commonly used in existing binary descriptors, which significantly enhances discriminative ability; and 2) rotation and illumination changes are handled by ordering pixels according to their intensities and gradient orientations, which is also more reliable than methods that resort to a reference orientation for rotation invariance. In addition, a statistical analysis of the discriminative abilities of different parts of the descriptor is conducted to design a cascade filter that rejects nonmatching descriptors at early stages by comparing just a small portion of the whole descriptor, further reducing matching time. Extensive experiments on four challenging data sets (Oxford, 53 Objects, ZuBuD, and Kentucky) show that OSRI significantly outperforms two state-of-the-art binary descriptors (FREAK and ORB). The matching performance of OSRI with only 512 bits is also better than that of the well-known floating-point descriptor SIFT (4K bits) and comparable with that of the state-of-the-art floating-point descriptor MROGH (6K bits), while OSRI is two orders of magnitude faster to match than SIFT and MROGH.
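The cascade filter idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 512-bit descriptor stored as 64 bytes, with the parts to be compared first placed at the front. The function names, the stage boundaries, and all thresholds are hypothetical values chosen only for illustration.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def cascade_match(query: bytes, candidate: bytes, stages, final_threshold: int):
    """Return the full Hamming distance, or None if the pair is rejected.

    stages is a list of (num_bytes, max_distance) pairs sorted by num_bytes.
    After the first num_bytes bytes have been compared, the pair is rejected
    if the accumulated distance already exceeds max_distance.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    dist = 0
    done = 0
    for num_bytes, max_distance in stages:
        # Only compare the bytes that earlier stages have not covered yet.
        dist += hamming(query[done:num_bytes], candidate[done:num_bytes])
        done = num_bytes
        if dist > max_distance:
            return None  # early rejection: remaining bytes are never read
    # Survivors of every stage get the full-length comparison.
    dist += hamming(query[done:], candidate[done:])
    return dist if dist <= final_threshold else None


# Hypothetical cascade: check after 8 bytes, then after 24 bytes.
STAGES = [(8, 16), (24, 48)]

query = bytes(64)                       # 512-bit all-zero descriptor
near = bytes([0x01]) + bytes(63)        # differs from query in one bit
far = bytes([0xFF] * 64)                # differs from query in every bit

near_result = cascade_match(query, near, STAGES, final_threshold=100)
far_result = cascade_match(query, far, STAGES, final_threshold=100)
```

Here `near` passes both stages and is accepted, while `far` is rejected after only the first 8 bytes are compared. That early exit is the source of the matching-time reduction the abstract attributes to the cascade.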