Human-in-the-loop Reinforcement Learning for Data Quality Monitoring in Particle Physics Experiments

Olivia Jullian Parra, Julián García Pardiñas, Lorenzo Del Pianta Pérez, Maximilian Janisch, Suzanne Klaver, Thomas Lehéricy, Nicola Serra
2024-05-24
Abstract: Data Quality Monitoring (DQM) is a crucial task in large particle physics experiments, since detector malfunctioning can compromise the data. DQM is currently performed by human shifters, which is costly and results in limited accuracy. In this work, we provide a proof-of-concept for applying human-in-the-loop Reinforcement Learning (RL) to automate the DQM process while adapting to operating conditions that change over time. We implement a prototype based on the Proximal Policy Optimization (PPO) algorithm and validate it on a simplified synthetic dataset. We demonstrate how a multi-agent system can be trained for continuous automated monitoring during data collection, with human intervention actively requested only when relevant. We show that random, unbiased noise in human classification can be reduced, leading to an improved accuracy over the baseline. Additionally, we propose data augmentation techniques to deal with scarce data and to accelerate the learning process. Finally, we discuss further steps needed to implement the approach in the real world, including protocols for periodic control of the algorithm's outputs.
High Energy Physics - Experiment, Machine Learning
What problem does this paper attempt to address?
This paper addresses Data Quality Monitoring (DQM) in large particle physics experiments, where detector malfunctions can compromise the recorded data. DQM is currently performed mainly by human shifters, which is costly and limits classification accuracy. The authors propose a human-in-the-loop Reinforcement Learning (RL) approach that automates DQM while adapting to operating conditions that change over time. They implement a prototype based on the Proximal Policy Optimization (PPO) algorithm and validate it on a simplified synthetic dataset. The study shows that a multi-agent system can be trained to monitor data continuously during collection and to request human intervention only when relevant. With this approach, random, unbiased noise in human classification can be reduced, improving accuracy over the baseline. The paper also proposes data augmentation techniques to cope with scarce data and to accelerate learning. It separates the study into online and offline phases and discusses the advantages of RL for adapting to changing conditions. The results on the synthetic dataset indicate that RL can reduce dependence on human shifters and improve the efficiency and accuracy of DQM. Deploying the approach in a real experiment will require further steps, including protocols for periodic control of the algorithm's outputs. In summary, the paper asks how RL can automate data quality monitoring in particle physics experiments while adapting to evolving operating conditions and reducing the limitations of manual operation.
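The claim that random, unbiased noise in human classification can be reduced can be illustrated with a toy sketch. This is not the paper's PPO-based multi-agent system: it swaps in a simple one-dimensional threshold classifier, and the feature distribution, the 20% flip probability, the true threshold of 0.5, and all function names are assumptions made purely for illustration. The idea shown is that label flips which are independent of the data tend to cancel out when a model is fitted to many noisy labels, so the fitted model can agree with the truth more often than the individual human labels do.

```python
import random


def simulate(n=2000, flip_prob=0.2, threshold=0.5, seed=0):
    """Generate a synthetic feature, true labels, and noisy 'shifter' labels.

    Assumption: a run is 'bad' when a single monitored quantity x exceeds
    a fixed threshold, and the human flips the label at random with
    probability flip_prob, independently of x (unbiased noise).
    """
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    truth = [x > threshold for x in xs]
    human = [(not t) if rng.random() < flip_prob else t for t in truth]
    return xs, truth, human


def fit_threshold(xs, labels, grid=101):
    """Pick the cut on x that agrees most often with the given labels."""
    best_cut, best_agree = 0.0, -1
    for i in range(grid):
        cut = i / (grid - 1)
        agree = sum((x > cut) == y for x, y in zip(xs, labels))
        if agree > best_agree:
            best_cut, best_agree = cut, agree
    return best_cut


def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


xs, truth, human = simulate()
cut = fit_threshold(xs, human)       # trained on noisy human labels only
model = [x > cut for x in xs]

human_acc = accuracy(human, truth)   # accuracy of the noisy labels
model_acc = accuracy(model, truth)   # accuracy of the fitted model
print(f"cut={cut:.2f} human={human_acc:.3f} model={model_acc:.3f}")
```

In this toy setting the model never sees the true labels, yet its fitted cut lands close to the assumed true threshold because the random flips do not favor either side. Biased or data-dependent human errors would not cancel in this way.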