MSQL+: a plugin toolkit for similarity search under metric spaces in distributed relational database systems
Wei Lu, Xinyi Zhang, Zhiyu Shui, Zhe Peng, Xiao Zhang, Xiaoyong Du, Hao Huang, Xiaoyu Wang, Anqun Pan, Haixiang Li
DOI: https://doi.org/10.14778/3229863.3236237
2018-01-01
Abstract: Similarity search is a primitive operation in various database applications. Thus far, a large number of access methods have been proposed to accelerate similarity query processing. Nonetheless, these methods mostly focus on developing standalone systems by proposing new indices. Given that existing RDBMSs support only traditional indices, it is of great necessity and practical importance to develop an approach that speeds up query processing using standard built-in RDBMS indices. In this demonstration, we introduce MSQL+, a plugin toolkit that enables users to answer similarity queries in metric spaces simply using standard SQL statements. This toolkit can help existing RDBMSs handle big data effectively and efficiently thanks to the following three advantages. First, MSQL+ enables users to find similar objects by submitting SELECT-FROM-WHERE statements, so it can be easily integrated into existing RDBMSs. Second, MSQL+ works in a more general data space: objects of any type can be indexed by B+-trees, and query processing can be accelerated with index seeks, as long as the similarity function is a metric. Third, MSQL+ supports the parallelization of both pre-processing and query processing in distributed RDBMSs.
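
The abstract does not say how objects of arbitrary type become keys in a B+-tree. One common technique, assumed here for illustration and not confirmed by the source as MSQL+'s design, is pivot mapping: each object is keyed by its distance to a fixed pivot, and the triangle inequality turns a range query into a one-dimensional key range followed by verification. In the minimal Python sketch below, a sorted list stands in for the B+-tree, and the names `build_index`, `range_query`, and the choice of edit distance as the metric are all hypothetical.

```python
import bisect


def edit_distance(a, b):
    """Levenshtein distance, a metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def build_index(objects, pivot, dist):
    """Key every object by its distance to the pivot.

    The sorted list of (key, object) pairs is a stand-in for a B+-tree
    on a numeric column (an assumption for this sketch).
    """
    return sorted((dist(o, pivot), o) for o in objects)


def range_query(index, pivot, dist, query, radius):
    """Return all objects within `radius` of `query`.

    By the triangle inequality, any match o satisfies
    |dist(o, pivot) - dist(query, pivot)| <= radius, so only that key
    range needs to be scanned (the analogue of an index seek).
    """
    dq = dist(query, pivot)
    keys = [k for k, _ in index]
    lo = bisect.bisect_left(keys, dq - radius)
    hi = bisect.bisect_right(keys, dq + radius)
    candidates = [o for _, o in index[lo:hi]]
    # Verification step: the key range can contain false positives.
    return sorted(o for o in candidates if dist(query, o) <= radius)


words = ["data", "date", "gate", "database", "metric", "matrix"]
index = build_index(words, "data", edit_distance)
result = range_query(index, "data", edit_distance, "date", 1)
print(result)  # ['data', 'date', 'gate']
```

In a relational setting, the key range would correspond to a `BETWEEN` predicate on an indexed distance column, with the exact distance check applied to the surviving rows. The filter is only correct because the distance function is a metric; without the triangle inequality, true matches could fall outside the scanned range.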