FLEX: A Fast and Light-weight Learned Index for kNN Search in High-Dimensional Space

Lingli Li, Ao Han, Xiaotong Cui, Baohua Wu
DOI: https://doi.org/10.1016/j.ins.2024.120546
IF: 8.1
2024-04-06
Information Sciences
Abstract: The k Nearest Neighbors (kNN) search in high-dimensional space is a fundamental problem with various applications. In this paper, we address this problem using deep neural networks (DNNs). We apply DNNs to represent complex correlations between high-dimensional objects, which lets us project similar objects into the same class and thereby reduce the search cost. Based on DNNs, we propose two novel techniques that improve query efficiency while maintaining accuracy. First, traditional DNNs typically demand extensive training data to achieve high accuracy. To reduce the training set size, we design a multi-module DNN framework comprising several small modules. Each module learns to capture part of the knowledge needed for a given query, and the outputs of these modules are then integrated to form the final result. Second, with machine learning models, the number of candidates is unbounded. We therefore design a linear-time data layout refinement algorithm that limits the number of candidates to a small constant. Empirically, we find that our approach significantly outperforms state-of-the-art methods in both time efficiency and space efficiency while attaining comparable or better accuracy.
computer science, information systems
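
The abstract's central idea, routing a query to a single class and scanning only a bounded set of candidates, can be illustrated with a minimal sketch. This is not the authors' method: a nearest-centroid router stands in for the multi-module DNN, and a fixed-capacity split along the first coordinate stands in for the data layout refinement algorithm. All names (build_buckets, route, knn_search, cap) are hypothetical and chosen for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_buckets(points, cap):
    """Hypothetical layout: sort by the first coordinate and cut into
    chunks of at most `cap` points, so every bucket has bounded size."""
    ordered = sorted(points, key=lambda p: p[0])
    return [ordered[i:i + cap] for i in range(0, len(ordered), cap)]


def centroid(bucket):
    """Coordinate-wise mean of the points in one bucket."""
    dim = len(bucket[0])
    return [sum(p[j] for p in bucket) / len(bucket) for j in range(dim)]


def route(query, centroids):
    """Stand-in for a learned classifier: index of the nearest centroid."""
    return min(range(len(centroids)), key=lambda i: dist(query, centroids[i]))


def knn_search(query, buckets, centroids, k):
    """Approximate kNN: scan only the routed bucket (at most `cap` points)."""
    candidates = buckets[route(query, centroids)]
    return sorted(candidates, key=lambda p: dist(query, p))[:k]


random.seed(0)
points = [[random.random() for _ in range(8)] for _ in range(200)]
buckets = build_buckets(points, cap=25)
centroids = [centroid(b) for b in buckets]
result = knn_search(points[0], buckets, centroids, k=3)
```

Because the search touches one bucket only, the work per query is bounded by cap regardless of the dataset size, at the price of approximate answers: a true neighbor stored in a different bucket is missed.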