A heterogeneous CPU-GPU implementation for discrete elements simulation with multiple GPUs

Yuan Tian, Junjie Lai, Lei Yang, Ji Qi, Qingguo Zhou
2013-11-02
Abstract: To handle the large number of particles in discrete element simulations, a heterogeneous CPU-GPU implementation with multiple GPUs is developed. The implementation combines two different parallel programming languages so that it can be deployed on a CPU-GPU cluster. Communication between nodes uses the Message Passing Interface (MPI) for dynamic domain decomposition, particle re-mapping, and copying of data in overlapping areas. All other work is assigned to the GPUs to obtain high computational speed. The results of strong and weak scalability tests are analyzed for different numbers of GPUs. Finally, LAMMPS is used as a CPU platform for comparison with the multi-GPU application, to demonstrate the advantage of the heterogeneous implementation.
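The abstract does not say how the domain decomposition or the overlapping areas are implemented, so the following is only a minimal sketch under stated assumptions: a 1D decomposition along x, a fixed interaction cutoff, and plain Python lists standing in for per-node buffers. The function name `decompose` and its parameters are hypothetical and do not come from the paper. It shows how particles might be mapped to subdomains, and which particles a neighbouring subdomain would need as overlap copies.

```python
from bisect import bisect_right


def decompose(xs, bounds, cutoff):
    """Map particles to 1D subdomains and list the overlap (halo) copies.

    xs     : particle x-coordinates
    bounds : sorted boundaries [b0, ..., bn]; subdomain i covers [b_i, b_{i+1})
    cutoff : interaction range; particles closer than this to a shared
             boundary are also needed by the neighbour (assumed rule)
    """
    n = len(bounds) - 1
    owned = [[] for _ in range(n)]  # particle ids owned by each subdomain
    halo = [[] for _ in range(n)]   # particle ids each subdomain receives as copies
    for pid, x in enumerate(xs):
        # Owning subdomain, clamped so out-of-range particles go to an edge domain.
        d = min(max(bisect_right(bounds, x) - 1, 0), n - 1)
        owned[d].append(pid)
        # Close to the left boundary: the left neighbour needs a copy.
        if d > 0 and x - bounds[d] < cutoff:
            halo[d - 1].append(pid)
        # Close to the right boundary: the right neighbour needs a copy.
        if d < n - 1 and bounds[d + 1] - x < cutoff:
            halo[d + 1].append(pid)
    return owned, halo


# Two subdomains, [0, 1) and [1, 2), with a cutoff of 0.1.
owned, halo = decompose([0.1, 0.95, 1.05, 1.9], [0.0, 1.0, 2.0], 0.1)
```

In a cluster setting, each `halo[i]` list would correspond to data sent to node `i` (for example over MPI), and re-running the mapping after particles move would play the role of particle re-mapping. Moving the entries of `bounds` to rebalance particle counts would be one possible form of dynamic decomposition, although the paper may use a different scheme.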