Multi-GPU accelerated cellular automaton model for simulating the solidification structure of continuous casting bloom
Jingjing Wang,Hongji Meng,Jian Yang,Zhi Xie
DOI: https://doi.org/10.1007/s11227-022-04839-z
IF: 3.3
2022-01-01
The Journal of Supercomputing
Abstract: A continuous casting bloom has a large cross-section and a long process route, which makes simulating its solidification structure computationally expensive. The traditional sequential algorithm on a CPU takes too long to meet the industrial demand for process guidance. This study developed a multi-GPU cellular automaton (CA) model to accelerate the calculation. Firstly, a heterogeneous GPU-CA parallel algorithm was developed to increase parallelism by eliminating data dependencies and data races among cells; in the capture process, a random arbitration mechanism determines which neighbor obtains the final right to capture a cell. Then, a multi-stream communication scheme was developed to overlap the calculation of the inner region with the data transfer and calculation of the halo region, hiding the overhead of data exchange between GPUs. Finally, the model was validated against the analytical LGK model and applied to simulate the solidification structure of GCr15 steel at a steel plant. The simulation shows a clear solidification structure with distinct columnar, equiaxed, and columnar-to-equiaxed transition (CET) zones. The proportions of the crystal zones agree well with low-magnification images from field experiments, with relative errors of 0.032%, 0.013%, and 0.025%. The multi-GPU application also calculates the temperature distribution during solidification with a maximum relative error of 0.013% compared to the field data. Furthermore, with almost the same accuracy as the single-core CPU implementation, the present model achieves a speedup of up to 700x, whereas a 20-core CPU achieves only about 14.2x.
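The random arbitration idea for the capture step can be illustrated with a minimal Python sketch. It is not the authors' GPU implementation: the grid encoding (`LIQUID = 0`, positive integers as grain IDs), the four-neighbor stencil, and the function name `capture_step` are assumptions made for illustration, and capture here is unconditional rather than driven by growth kinetics. Each liquid cell reads only the old grid and writes only its own entry in a new grid, so no two cells ever write the same location, which is one common way to avoid a data race when each cell maps to a GPU thread.

```python
import random

LIQUID = 0  # hypothetical encoding: 0 = liquid, positive integer = grain ID


def capture_step(grain, rng):
    """Return the grid after one synchronous capture step.

    Every liquid cell inspects its four nearest neighbors in the OLD grid.
    If several solid neighbors compete for it, one of them is picked at
    random and the cell takes that neighbor's grain ID.
    """
    rows, cols = len(grain), len(grain[0])
    # Separate write buffer: the old grid is only read, never modified.
    new_grain = [row[:] for row in grain]
    for i in range(rows):
        for j in range(cols):
            if grain[i][j] != LIQUID:
                continue  # solid cells keep their grain ID
            candidates = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < rows and 0 <= nj < cols
                if inside and grain[ni][nj] != LIQUID:
                    candidates.append(grain[ni][nj])
            if candidates:
                # Random arbitration among the competing solid neighbors.
                new_grain[i][j] = rng.choice(candidates)
    return new_grain


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [[LIQUID] * 5 for _ in range(5)]
    grid[2][0], grid[2][4] = 1, 2  # two seed grains on opposite edges
    step1 = capture_step(grid, rng)
    step2 = capture_step(step1, rng)
    print(step2[2][2])  # grain 1 or grain 2, depending on the random draw
```

In the example, the center cell is reached by both grains in the second step, and the random draw decides which grain captures it.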