Fast retrieval method and system for large-scale high-dimensional data

Wang Jianmin, Long Mingsheng, Cao Yue, Liu Bin
2018-01-01
Abstract: The invention provides a method and system for approximate nearest neighbor retrieval of large-scale high-dimensional data, based on product quantization and an inverted multi-index. The method comprises the following steps: a binary code corresponding to the query (the data to be retrieved) is obtained from a trained product quantization unit, where the binary code identifies the cluster center nearest to the query; the binary code is input into an inverted multi-index unit matched with the trained product quantization unit, yielding a candidate set of the data in a preset database that lie nearest to the query; the data in the set are then sorted by their distance to the query, and the sorted data serve as the retrieval result. The method and system can greatly improve both the accuracy and the time efficiency of large-scale similarity retrieval over high-dimensional data.
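The three steps in the abstract (encode with a product quantizer, look up candidates in an index keyed by the codes, re-rank by distance) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the plain k-means codebook training, the `n_probe` parameter, and the use of exact squared Euclidean distance for re-ranking are all assumptions made for illustration. In particular, probing the Cartesian product of the `n_probe` nearest centroids per subspace is a simplified stand-in for the ordered cell traversal an inverted multi-index would normally use. The vector dimension is assumed to be divisible by the number of subspaces.

```python
import itertools
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(0)
    return centers


def train_pq(data, n_sub=2, k=8):
    """Train one codebook per subspace (the 'product quantization unit')."""
    parts = np.split(data, n_sub, axis=1)
    return [kmeans(part, k, seed=i) for i, part in enumerate(parts)]


def encode(codebooks, x):
    """Map each row of x to one nearest-centroid id per subspace."""
    parts = np.split(x, len(codebooks), axis=1)
    codes = [((p[:, None, :] - c[None, :, :]) ** 2).sum(-1).argmin(1)
             for p, c in zip(parts, codebooks)]
    return np.stack(codes, axis=1)


def build_index(codes):
    """Inverted lists keyed by the tuple of per-subspace centroid ids."""
    index = {}
    for i, code in enumerate(codes):
        index.setdefault(tuple(code.tolist()), []).append(i)
    return index


def search(query, data, codebooks, index, n_probe=2, top_k=5):
    """Collect candidates from nearby cells, then sort them by distance."""
    parts = np.split(query, len(codebooks))
    # The n_probe nearest centroids in each subspace (assumed probing rule).
    near = [((c - p) ** 2).sum(1).argsort()[:n_probe].tolist()
            for p, c in zip(parts, codebooks)]
    candidates = []
    for cell in itertools.product(*near):
        candidates.extend(index.get(cell, []))
    candidates = np.array(candidates, dtype=int)
    if len(candidates) == 0:
        return candidates
    # Re-rank the candidate set by exact squared distance to the query.
    dist = ((data[candidates] - query) ** 2).sum(1)
    return candidates[dist.argsort()[:top_k]]


# Usage on synthetic data (hypothetical sizes chosen for illustration).
rng = np.random.default_rng(42)
database = rng.normal(size=(200, 8))
codebooks = train_pq(database, n_sub=2, k=8)
index = build_index(encode(codebooks, database))
neighbors = search(database[17], database, codebooks, index)
```

Only the cells near the query are visited, so the distance computation in `search` touches a small candidate set rather than the whole database; raising `n_probe` trades speed for recall.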