A GPU-accelerated Monte Carlo code, RT2 for coupled transport of photon, electron/positron, and neutron

Chang-Min Lee, Sung-Joon Ye
DOI: https://doi.org/10.1088/1361-6560/ad694f
2024-08-14
Abstract: Objective. This work aims to develop a graphics processing unit (GPU)-accelerated Monte Carlo (MC) code for the coupled transport of photons, electrons/positrons and neutrons over a broad range of energies for medical applications. Approach. Branch divergence was alleviated by separating the MC evolution of radiation into source, transport, and interaction kernels. Memory coalescence was achieved by vectorizing the access pattern in which the secondary particles were archived. To further accelerate particle tracking, ray-tracing hardware acceleration in the Nvidia OptiX™ framework was applied. For photons and electrons/positrons, the EGSnrc interaction modules were ported in a GPU-optimized configuration. For neutrons, group-wise transport based on NJOY21 preprocessed data was implemented. The developed code was validated against CPU-based FLUKA. Neutron, x-ray and electron beams incident on water and ICRP phantoms were simulated. The neutron energy groups and the photon and electron transport parameters were set to be the same in both codes. The developed code ran on a single Nvidia RTX 4090 card, while FLUKA used all 20 threads of a single Intel Core i9-10900K node. Main results. The number of histories was set to ensure statistical uncertainties lower than 2% for all voxels whose doses were larger than 20% of the maximum. In all cases, the voxel dose differences between the codes were within 2.5%. For photons and electrons, the developed code was 150-300 times faster than FLUKA in both geometries. For neutrons, the code was 80 and 135 times faster than FLUKA in the water and ICRP phantoms, respectively. Significance. This study offers an appropriate solution for the uncoalesced memory access and branch divergence commonly encountered in coupled MC transport on the GPU architecture. The formidable acceleration in computing time and the accuracy shown in this study hold promise for routine clinical use of MC simulations.
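The acceptance criteria quoted under Main results can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `check_agreement`, its parameters, and the sample numbers are hypothetical. It also assumes that the maximum is taken from the reference (FLUKA) dose, that uncertainties are relative values, and that the dose difference is computed relative to the local reference dose; the abstract does not state these conventions.

```python
def check_agreement(dose_test, dose_ref, unc_test, unc_ref,
                    dose_threshold=0.20, unc_limit=0.02, diff_limit=0.025):
    """Compare two voxel dose lists using thresholds quoted in the abstract.

    Assumptions (not stated in the abstract): the maximum is taken from the
    reference dose, uncertainties are relative, and the difference is
    normalized to the local reference dose.
    """
    d_max = max(dose_ref)
    # Keep only voxels whose reference dose exceeds 20% of the maximum.
    selected = [i for i, d in enumerate(dose_ref) if d > dose_threshold * d_max]
    # Statistical uncertainty must be below 2% in both codes for those voxels.
    uncertainty_ok = all(unc_test[i] < unc_limit and unc_ref[i] < unc_limit
                         for i in selected)
    # Largest relative dose difference among the selected voxels.
    max_rel_diff = max(abs(dose_test[i] - dose_ref[i]) / dose_ref[i]
                       for i in selected)
    return {
        "n_selected": len(selected),
        "uncertainty_ok": uncertainty_ok,
        "max_rel_diff": max_rel_diff,
        "dose_ok": max_rel_diff <= diff_limit,
    }


# Made-up four-voxel example; the last voxel falls below the 20% threshold
# and is therefore excluded from both checks.
dose_ref = [1.00, 0.80, 0.50, 0.10]
dose_test = [1.01, 0.79, 0.51, 0.13]
unc_ref = [0.010, 0.012, 0.018, 0.050]
unc_test = [0.010, 0.012, 0.018, 0.050]

result = check_agreement(dose_test, dose_ref, unc_test, unc_ref)
```

In this made-up example the low-dose voxel is ignored even though its difference and uncertainty are large, which mirrors how a 20%-of-maximum cutoff restricts the comparison to the clinically relevant dose region.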