A Novel Cell Partition Method by Introducing Silhouette Coefficient for Fast Approximate Nearest Neighbor Search

Wenwen Song, Yang Wang, Zhibin Pan
DOI: https://doi.org/10.1016/j.ins.2023.119216
IF: 8.1
2023-05-26
Information Sciences
Abstract: This paper proposes a novel cell partition method to accelerate approximate nearest neighbor (ANN) search. Traditional cell-level elimination of database vectors, as in the inverted file system (IVF), is limited in how much it can accelerate the search process. The Silhouette Coefficient is therefore introduced to realize more accurate and effective vector-level pruning. Its value serves as a novel criterion for dividing each Voronoi cell into a center region and a border region. In addition, a dynamic threshold selection model determines the sampling rate for each cell adaptively. The proposed method greatly improves search speed by reducing the number of vector-level distance calculations. Moreover, the Silhouette Coefficient values are stored as a rank generated offline within each Voronoi cell, which incurs no extra memory cost. Experiments on two widely used datasets, SIFT1M and GIST1M, show that with no more than 1% accuracy reduction, search speed increases by about 44% compared with the standard inverted file system. Compared with state-of-the-art ANN search methods that aim at acceleration, the proposed method also achieves higher search accuracy with almost the same computational consumption.
computer science, information systems
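
Illustrative sketch (not from the paper): the abstract describes ranking the vectors of each Voronoi cell by their Silhouette Coefficient and splitting the cell into a center region and a border region. The Python code below shows one way this idea could look, under several assumptions that the abstract does not confirm. It uses a simplified, centroid-based silhouette s = (b - a) / max(a, b), where a is the distance to the vector's own centroid and b is the distance to the nearest other centroid; the standard definition uses mean distances to cluster members instead. The function names simplified_silhouette and partition_cells and the fixed center_fraction parameter are hypothetical; the paper's dynamic threshold selection model, which chooses the rate per cell, is not reproduced here.

```python
import math
import numpy as np


def simplified_silhouette(vectors, centroids):
    """Assign each vector to its nearest centroid and score it.

    Assumption: a centroid-based simplification of the Silhouette
    Coefficient, s = (b - a) / max(a, b). Requires at least two centroids.
    """
    # Pairwise Euclidean distances, shape (n_vectors, n_centroids).
    dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    rows = np.arange(len(vectors))
    a = dists[rows, assign]            # distance to own centroid
    others = dists.copy()
    others[rows, assign] = np.inf      # mask out the own centroid
    b = others.min(axis=1)             # distance to nearest other centroid
    # Guard against division by zero when a == b == 0.
    scores = (b - a) / np.maximum(np.maximum(a, b), 1e-12)
    return assign, scores


def partition_cells(vectors, centroids, center_fraction=0.5):
    """Split each cell into (center_ids, border_ids) by silhouette rank.

    center_fraction is a hypothetical fixed stand-in for a per-cell rate.
    """
    assign, scores = simplified_silhouette(vectors, centroids)
    cells = {}
    for c in range(len(centroids)):
        ids = np.flatnonzero(assign == c)
        # Offline rank: highest silhouette (deepest inside the cell) first.
        order = ids[np.argsort(-scores[ids], kind="stable")]
        n_center = math.ceil(center_fraction * len(ids))
        cells[c] = (order[:n_center], order[n_center:])
    return cells
```

Because each cell stores its vector ids in rank order, the split is only a position in that ordering, which is consistent with the abstract's remark that storing the rank adds no extra memory. How the two regions are then used during search is not specified in the abstract and is left out of this sketch.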