DPG: A Cache-Efficient Accelerator for Sorting and for Join Operators

Gene Cooperman, Xiaoqin Ma, Viet Ha Nguyen
DOI: https://doi.org/10.48550/arXiv.cs/0308004
2003-08-02
Abstract: We present a new algorithm for fast record retrieval, distribute-probe-gather, or DPG. DPG has important applications both in sorting and in joins. Current main memory sorting algorithms split their work into three phases: extraction of key-pointer pairs; sorting of the key-pointer pairs; and copying of the original records into the destination array according to the sorted key-pointer pairs. The copying in the last phase dominates today's sorting time. Hence, the use of DPG in the third phase provides an accelerator for existing sorting algorithms. DPG also provides two new join methods for foreign key joins: DPG-move join and DPG-sort join. The resulting join methods with DPG are faster because DPG join is cache-efficient and at the same time DPG join avoids the need for sorting or for hashing. The ideas presented for foreign key join can also be extended to faster record pair retrieval for spatial and temporal databases.
Databases, Data Structures and Algorithms
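
The three-phase structure of main memory sorting that the abstract describes can be sketched as follows. This is only an illustration of the baseline that DPG is meant to accelerate, not DPG itself, whose details the abstract does not give. The record layout, the function name `key_pointer_sort`, and the use of list indices in place of pointers are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (key, payload). Real records would be wider.
Record = Tuple[int, str]


def key_pointer_sort(records: List[Record]) -> List[Record]:
    """Sort records with the three-phase key-pointer scheme (illustrative)."""
    # Phase 1: extract (key, index) pairs; the index stands in for a pointer.
    pairs = [(rec[0], i) for i, rec in enumerate(records)]

    # Phase 2: sort the small key-pointer pairs instead of the full records.
    pairs.sort()

    # Phase 3: copy each record into the destination in sorted order.
    # The reads from `records` jump around the source array; this is the
    # copying step that the abstract says dominates sorting time.
    return [records[i] for _, i in pairs]
```

In this sketch, phase 3 reads the source records in an order determined by the sorted keys, so successive reads rarely touch neighbouring memory. That access pattern is the one the abstract proposes to replace with DPG.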