Performance Optimization of a DEM Simulation Framework on GPU Using a Stencil Model

Ran Xue, Yuxin Wang, He Guo, Chi Zhang, Shunying Ji
DOI: https://doi.org/10.1007/978-3-319-32557-6_11
2016-01-01
Abstract: High performance and efficiency in parallel computing are important for large-scale discrete element method (DEM) simulation. After analyzing a DEM simulation framework built on a Graphics Processing Unit (GPU) platform with the CUDA architecture and evaluating the simulated data, we propose three optimization methods to improve the performance of the system. First, a stencil computation model based on gridding is applied to neighboring-particle searching and to the calculation of particle-particle contact forces. Second, a reasonable and effective parallel granularity is sought by varying the number of blocks and threads on the GPU. Third, shared memory is used for data prefetching and for storing the results of intermediate calculations, with its use determined by rational analysis and calculation. The experimental results show that the stencil model is useful for particle searching and force calculation, and that a rational parallel granularity together with appropriate use of shared memory improves the performance of the DEM simulation framework.
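As an illustration of the grid-based stencil search the abstract describes, the following is a minimal CPU sketch in Python, not the authors' CUDA implementation. It assumes 2D particles of equal radius and a cell size equal to one particle diameter, so that any two overlapping particles must lie in the same or adjacent cells. The names `build_grid`, `find_contacts`, and `STENCIL` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import product

# 3x3 stencil of cell offsets: the cell itself plus its 8 neighbours (2D assumption).
STENCIL = list(product((-1, 0, 1), repeat=2))


def build_grid(positions, cell_size):
    """Bin particle indices by the integer coordinates of their grid cell."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    return grid


def find_contacts(positions, radius):
    """Return sorted index pairs (i, j), i < j, of overlapping equal-radius particles.

    Assumption: cell size equals the particle diameter, so every overlapping
    pair is found by scanning only the 3x3 stencil around each occupied cell.
    """
    diameter = 2.0 * radius
    grid = build_grid(positions, diameter)
    contacts = set()
    for (cx, cy), members in grid.items():
        for dx, dy in STENCIL:
            # .get avoids creating empty cells while iterating over the grid.
            neighbours = grid.get((cx + dx, cy + dy), ())
            for i in members:
                xi, yi = positions[i]
                for j in neighbours:
                    if i < j:
                        xj, yj = positions[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 < diameter ** 2:
                            contacts.add((i, j))
    return sorted(contacts)
```

For example, `find_contacts([(0.0, 0.0), (0.9, 0.0), (5.0, 5.0), (1.1, 0.0)], 0.5)` reports contacts only between particles 0 and 1 and between particles 1 and 3. On a GPU, the loop over occupied cells would plausibly be distributed across thread blocks, with the stencil's particle data staged in shared memory; how the paper actually maps this is not stated in the abstract.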