Accelerating Machine Learning Inference with GPUs in ProtoDUNE Data Processing

Tejin Cai, Kenneth Herner, Tingjun Yang, Michael Wang, Maria Acosta Flechas, Philip Harris, Burt Holzman, Kevin Pedro, Nhan Tran
DOI: https://doi.org/10.1007/s41781-023-00101-0
2023-10-28
Abstract: We study the performance of a cloud-based GPU-accelerated inference server to speed up event reconstruction in neutrino data batch jobs. Using detector data from the ProtoDUNE experiment and employing the standard DUNE grid job submission tools, we attempt to reprocess the data by running several thousand concurrent grid jobs, a rate we expect to be typical of current and future neutrino physics experiments. We process most of the dataset with the GPU version of our processing algorithm and the remainder with the CPU version for timing comparisons. We find that a 100-GPU cloud-based server is able to easily meet the processing demand, and that using the GPU version of the event processing algorithm is two times faster than processing these data with the CPU version when comparing to the newest CPUs in our sample. The amount of data transferred to the inference server during the GPU runs can overwhelm even the highest-bandwidth network switches, however, unless care is taken to observe network facility limits or otherwise distribute the jobs to multiple sites. We discuss the lessons learned from this processing campaign and several avenues for future improvements.
High Energy Physics - Experiment; Distributed, Parallel, and Cluster Computing; Data Analysis, Statistics and Probability
What problem does this paper attempt to address?
This paper asks how GPU-accelerated machine learning inference can speed up event reconstruction in large-scale data processing. Specifically, the researchers examine whether a cloud-based GPU-accelerated inference server can significantly reduce event reconstruction time when processing data from the ProtoDUNE experiment, and whether this approach can keep up with the higher data volumes expected in the future. The paper also explores how to set job concurrency limits when network resources are limited, so that network saturation is avoided while the benefit of GPU acceleration is preserved. These challenges matter for current and future neutrino physics experiments, which need to process large amounts of data quickly and efficiently. In particular, when a supernova within or near the Milky Way is detected, rapid reconstruction of detector trigger records is crucial for providing prompt location information to optical telescopes.
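The concurrency question can be illustrated with a back-of-envelope calculation. The sketch below is not taken from the paper: the function name `max_concurrent_jobs`, its parameters, and the numbers in the usage lines are hypothetical and chosen only for illustration. It assumes each job sends data to the inference server at a roughly constant average rate, and caps the number of jobs so that their combined traffic stays below a chosen fraction of the link capacity.

```python
import math


def max_concurrent_jobs(link_gbps: float, per_job_mbps: float,
                        headroom: float = 0.8) -> int:
    """Largest job count whose combined traffic fits in headroom * link capacity.

    All inputs are assumptions supplied by the caller, not measured values:
      link_gbps     capacity of the shared network link, in Gbit/s
      per_job_mbps  average outbound rate of one job, in Mbit/s
      headroom      fraction of the link the jobs are allowed to fill
    """
    if link_gbps <= 0 or per_job_mbps <= 0:
        raise ValueError("bandwidths must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable_mbps = link_gbps * 1000.0 * headroom  # convert Gbit/s to Mbit/s
    return math.floor(usable_mbps / per_job_mbps)


if __name__ == "__main__":
    # Hypothetical numbers: a 10 Gbit/s link, 25 Mbit/s per job, half the link usable.
    print(max_concurrent_jobs(link_gbps=10, per_job_mbps=25, headroom=0.5))
```

A real limit would also need to account for bursty traffic and for other users of the same link, which a constant-rate estimate like this one ignores.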