Mean-Removed Product Quantization for Large-scale Image Retrieval

Jiacheng Yang, Bin Chen, Shu-Tao Xia
DOI: https://doi.org/10.1016/j.neucom.2020.04.026
IF: 6
2020-01-01
Neurocomputing
Abstract: Product quantization (PQ) and its variations are popular and attractive for approximate nearest neighbor (ANN) search due to their low memory usage and fast retrieval speed. PQ is a divide-and-conquer algorithm: it decomposes the high-dimensional vector space into disjoint low-dimensional subspaces and applies k-means to each of them. However, when the average amplitude of the components varies widely across data points, applying PQ directly to the data points results in poor performance. To this end, we propose a novel approach, mean-removed product quantization (MRPQ), to address this issue. The average amplitude of a data point, that is, the mean of its components, can be regarded as statistically independent of its residuals. Thus we can learn a separate scalar quantizer for the means of the data points and apply PQ to their residuals. Comprehensive experiments show that our approach achieves substantial improvements in Recall and MAP over several known methods. Moreover, our approach is general and can be combined with PQ and its variations. Because the variances of the means are unbalanced among different subspaces, we further develop an adaptive subspace mean-removed product quantization (ASMRPQ) to achieve better performance.
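The idea described in the abstract (quantize each point's mean with a scalar quantizer, then apply PQ to the residual) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`kmeans`, `train_mrpq`, `encode`, `decode`), the codebook sizes, and the use of plain Lloyd's k-means for the scalar quantizer are all assumptions made for illustration, and the adaptive subspace variant (ASMRPQ) is not covered.

```python
import numpy as np

def nearest(X, C):
    """Index of the closest centroid in C for every row of X."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns the centroids."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = nearest(X, C)
        for j in range(k):
            members = X[assign == j]
            if len(members):          # leave empty clusters where they are
                C[j] = members.mean(0)
    return C

def train_mrpq(X, n_sub=4, k=16, k_mean=16):
    """Learn a scalar codebook for per-point means and PQ codebooks for residuals.
    Assumes the dimension of X is divisible by n_sub."""
    means = X.mean(axis=1, keepdims=True)      # one scalar per data point
    residuals = X - means                      # mean-removed vectors
    mean_cb = kmeans(means, k_mean)            # 1-D k-means as scalar quantizer
    sub_cbs = [kmeans(b, k) for b in np.split(residuals, n_sub, axis=1)]
    return mean_cb, sub_cbs

def encode(X, mean_cb, sub_cbs):
    """Return (mean codes, residual sub-codes) for every row of X."""
    means = X.mean(axis=1, keepdims=True)
    blocks = np.split(X - means, len(sub_cbs), axis=1)
    sub_codes = np.stack([nearest(b, cb) for b, cb in zip(blocks, sub_cbs)], axis=1)
    return nearest(means, mean_cb), sub_codes

def decode(mean_code, sub_codes, mean_cb, sub_cbs):
    """Reconstruct: quantized mean plus concatenated residual centroids."""
    res = np.hstack([cb[c] for cb, c in zip(sub_cbs, sub_codes.T)])
    return mean_cb[mean_code] + res            # (n, 1) broadcasts over (n, D)

# Synthetic data whose per-point mean varies much more than its residuals.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16)) + rng.normal(scale=10.0, size=(500, 1))
mean_cb, sub_cbs = train_mrpq(X)
mean_code, sub_codes = encode(X, mean_cb, sub_cbs)
X_hat = decode(mean_code, sub_codes, mean_cb, sub_cbs)
mse = float(((X - X_hat) ** 2).mean())
```

Each point is stored as one mean code plus `n_sub` residual codes. The sketch spends extra bits on the mean code; how the paper allocates bits between the mean quantizer and the residual quantizers is not stated in the abstract.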