Efficient Top-K Query Processing Algorithms in Highly Distributed Environments

Qiming Fang,Guangwen Yang
DOI: https://doi.org/10.4304/jcp.9.9.2000-2006
2014-01-01
Journal of Computers
Abstract:Efficient top-k query processing in highly distributed environments is a valuable but challenging research topic. This paper focuses on the problem over vertically partitioned data and aims to propose more efficient algorithms.. The effort is put on limiting the data transferred and communication round trips among nodes to reduce the communication cost of the query processing. Two novel algorithms, BulkDBPA and 4RUT, are proposed. BulkDBPA is derived from the centralized algorithm BPA2 which requires very low data access. BulkDBPA borrows the idea of best position from BPA2 and so has the advantage of low data transferred. It further reduces the communication round trips by utilizing bulk read and bulk transfer mechanism. 4RUT is inspired by the algorithm TPUT which only requires three communication round trips to get the exact top-k results. 4RUT improves its top-k lower bound estimate by introducing one additional communication round trip, which can subsequently reduce the data transferred in query processing. Experimental results show that both BulkDBPA and 4RUT require much less data transferred and response time than the competitors including Simple Algorithm and TPUT and each has its own suitable application environments respectively.
What problem does this paper attempt to address?