Performance Analysis of Join Algorithms on GPUs

Ran Rui, Hao Li, Yi-Cheng Tu
2015-01-01
Abstract: Implementing database operations on parallel platforms has gained a lot of momentum in the past decade, due to the increasing popularity of many-core processors. A number of studies have shown the potential of using GPUs to speed up database operations. In this paper, we present an empirical evaluation of a state-of-the-art work on GPU-based join processing published in SIGMOD’08. That work provides four major join algorithms and a number of join-related primitives. Since 2008, the compute capabilities of GPUs have grown at a faster pace than those of multi-core CPUs. We run a comprehensive set of experiments to study how join operations can benefit from this rapid expansion of GPU capabilities. Our experiments on today’s mainstream GPU and CPU hardware show that the GPU join program achieves up to 20X speedup over a highly optimized CPU version. This is significantly better than the 7X performance gap reported in the original paper. We also modify the GPU programs to take advantage of new GPU hardware and software features such as the read-only data cache, the large L2 cache, and shuffle instructions. With these optimizations, we observe an additional performance improvement of 30-52% in various components of the GPU program. Finally, we evaluate the same program from a few other perspectives, including energy efficiency, floating-point performance, and program development considerations, to further reveal the advantages and limitations of using GPUs for database operations. In summary, we find that today’s GPUs are significantly faster in floating-point operations, can process more on-board data, and achieve higher energy efficiency than modern CPUs. The availability of new tools and models has made program development and optimization on GPUs much easier than before.
Computer Science