Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search

Lars Gottesbüren, Laxman Dhulipala, Rajesh Jayaram, Jakub Lacki
2024-03-04
Abstract: We consider the fundamental problem of decomposing a large-scale approximate nearest neighbor search (ANNS) problem into smaller sub-problems. The goal is to partition the input points into neighborhood-preserving shards, so that the nearest neighbors of any point are contained in only a few shards. When a query arrives, a routing algorithm is used to identify the shards which should be searched for its nearest neighbors. This approach forms the backbone of distributed ANNS, where the dataset is so large that it must be split across multiple machines.
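The shard-then-route pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: as a simple stand-in for graph partitioning, it assigns each point to the nearest of a few randomly sampled centers, and it routes a query to the shards whose centers are closest. The function names (`build_shards`, `route`, `search`) and the parameters `num_shards` and `num_probes` are assumptions made for illustration only.

```python
import math
import random


def build_shards(points, num_shards, seed=0):
    """Partition points into shards (illustrative center-based stand-in).

    Each point goes to the shard whose sampled center is closest, so that
    nearby points tend to land in the same shard.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, num_shards)
    shards = [[] for _ in range(num_shards)]
    for idx, p in enumerate(points):
        nearest = min(range(num_shards), key=lambda c: math.dist(p, centers[c]))
        shards[nearest].append(idx)
    return centers, shards


def route(query, centers, num_probes):
    """Return the ids of the num_probes shards whose centers are closest."""
    order = sorted(range(len(centers)), key=lambda c: math.dist(query, centers[c]))
    return order[:num_probes]


def search(query, points, centers, shards, k, num_probes):
    """Search only the routed shards and return the k closest point ids."""
    candidates = [i for s in route(query, centers, num_probes) for i in shards[s]]
    candidates.sort(key=lambda i: math.dist(query, points[i]))
    return candidates[:k]
```

In this sketch, raising `num_probes` trades query cost for recall: probing every shard reduces to an exact brute-force search, while probing a single shard is cheapest but can miss neighbors that fall on the far side of a shard boundary.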
Data Structures and Algorithms, Information Retrieval