Distributed file column storage indexing method

Qin Xiongpai, Chen Yueguo, Du Xiaoyong, Zhao Liping
2018-01-01
Abstract: The invention discloses an indexing method for distributed file column storage. The method comprises the following steps: parsing a query statement to obtain the query conditions; reading, according to the index field in the query conditions, an index copy column, wherein the index copy column is a sorted duplicate of the index field within a Stripe of the column storage engine and each of its entries comprises a column value and the column offset of that value; and determining, from the index copy column, the Row Groups that ultimately need to be read. The duplicated, sorted data can serve as a direct index for reading the data, which reduces disk I/O overhead and improves query performance. The sorted data can also serve as an indirect index that reduces the data read for other fields, further improving query efficiency. Indexing is thereby achieved at the finest storage granularity, and the performance of fine-grained queries is improved.
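The lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of a row position within the Stripe as the "column offset", and the fixed `rows_per_group` size are all assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right


def build_index_copy(column_values):
    """Build a sorted duplicate of one Stripe's index field.

    Each entry is (value, offset), where offset is assumed to be the
    row position of the value inside the Stripe.
    """
    return sorted((value, offset) for offset, value in enumerate(column_values))


def row_groups_for_range(index_copy, lo, hi, rows_per_group):
    """Return the Row Groups holding values in the closed range [lo, hi]."""
    keys = [value for value, _ in index_copy]
    start = bisect_left(keys, lo)    # first entry with value >= lo
    end = bisect_right(keys, hi)     # first entry with value > hi
    # Map each matching offset to its Row Group (assumed fixed-size groups).
    return sorted({offset // rows_per_group for _, offset in index_copy[start:end]})


# Hypothetical Stripe of 8 rows, split into Row Groups of 2 rows each.
stripe = [42, 7, 19, 88, 3, 61, 19, 75]
index_copy = build_index_copy(stripe)
print(row_groups_for_range(index_copy, 15, 45, rows_per_group=2))  # [0, 1, 3]
```

In this sketch the sorted copy answers the range predicate directly, and the offsets it carries tell the reader which Row Groups of the other fields must be fetched, so Row Group 2 is never read.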