Parallel Data Mining on Graphics Processors

Wenbin Fang, Ka Keung Lau, Mian Lu, Xiangye Xiao, Chi Kit Lam, Philip Yang Yang, Bingsheng He, Qiong Luo, Pedro V. Sander, Ke Yang
2008-01-01
Abstract: We introduce GPUMiner, a novel parallel data mining system that utilizes new-generation graphics processing units (GPUs). Our system relies on the massively multi-threaded SIMD (Single Instruction, Multiple Data) architecture provided by GPUs. As special-purpose co-processors, GPUs are highly optimized for graphics rendering and rely on the CPU for data input/output as well as complex program control. Therefore, we design GPUMiner to consist of the following three components: (1) a CPU-based storage and buffer manager to handle I/O and data transfer between the CPU and the GPU, (2) a GPU-CPU co-processing parallel mining module, and (3) a GPU-based mining visualization module. We choose the GPU-CPU co-processing scheme for each mining algorithm according to its complexity and inherent parallelism. The visualization module lets users observe and interact with the mining process online. We have implemented the k-means clustering and the Apriori frequent pattern mining algorithms in GPUMiner. Our preliminary results show significant speedups over state-of-the-art CPU implementations on a PC with a G80 GPU and a quad-core CPU. We will demonstrate the mining process through our visualization module. Code and documentation of GPUMiner are available at http://code.google.com/p/gpuminer/.
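To illustrate how the inherent parallelism of an algorithm can guide a co-processing split, the following is a minimal CPU-only sketch of k-means in Python. It is not GPUMiner's implementation, and the abstract does not specify how GPUMiner divides the work. The function names (`assign_points`, `update_centroids`, `kmeans`) and the split itself are assumptions made for illustration: the assignment step treats every point independently, which is the kind of work that would map naturally to one GPU thread per point, while the centroid update is a reduction and the convergence check is sequential control logic.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_points(points, centroids):
    """Data-parallel step: each point independently picks its nearest centroid.

    No point depends on any other point, so on a GPU this loop could
    (hypothetically) become one thread per point.
    """
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Reduction step: average the points assigned to each cluster.

    An empty cluster keeps its previous centroid.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    return [
        tuple(s / counts[c] for s in sums[c]) if counts[c] else tuple(centroids[c])
        for c in range(k)
    ]


def kmeans(points, centroids, max_iters=100):
    """Control loop: alternate the two steps until the labels stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iters):
        new_labels = assign_points(points, centroids)
        if new_labels == labels:  # converged: no point changed cluster
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

In this sketch only `assign_points` is embarrassingly parallel; whether the reduction and the loop control would run on the GPU or the CPU is exactly the kind of per-algorithm decision the abstract describes.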