Finetuning Foundation Models for Joint Analysis Optimization

Matthias Vigl, Nicole Hartman, Lukas Heinrich
2024-01-26
Abstract: In this work we demonstrate that significant gains in performance and data efficiency can be achieved in High Energy Physics (HEP) by moving beyond the standard paradigm of sequential optimization of reconstruction and analysis components. We conceptually connect HEP reconstruction and analysis to modern machine learning workflows such as pretraining, finetuning, domain adaptation and high-dimensional embedding spaces and quantify the gains in the example use case of searches for heavy resonances decaying via an intermediate di-Higgs system to four $b$-jets.
High Energy Physics - Experiment; Machine Learning; High Energy Physics - Phenomenology; Data Analysis, Statistics and Probability
What problem does this paper attempt to address?
This paper addresses a limitation of data analysis in high-energy physics (HEP): the conventional pipeline optimizes its reconstruction and analysis components one step at a time, which need not be optimal for the pipeline as a whole. The authors show that significant gains in performance and data efficiency can be achieved by adopting concepts from modern large-scale machine learning (ML) workflows, namely pre-training, fine-tuning, domain adaptation, and high-dimensional embedding spaces. They quantify these gains in a case study: a search for heavy resonances decaying via an intermediate di-Higgs system to four $b$-jets.

Traditionally, HEP data analysis adopts a hierarchical pattern recognition and inference approach, in which low-level patterns are identified first and then progressively reconstructed and analyzed. The paper points out that optimizing each step separately may fail to reach the global optimum. Using a global gradient-based optimization strategy instead, the authors demonstrate that performance improves at a fixed sample size and that fewer samples are needed to reach a given performance.

The main contributions of the paper are:
1. Establishing correspondences between HEP analysis workflows and modern deep learning concepts, such as foundation models, downstream tasks, and fine-tuning.
2. Demonstrating end-to-end optimization in the setting of particle physics, including fine-tuning of object representations and event-level inference.
3. Quantifying significant improvements in data efficiency, and in performance at a fixed sample size, through end-to-end optimization.
4. Providing evidence of successful domain adaptation when fine-tuning HEP foundation models on datasets that were not used for pre-training.

Related work includes studies on optimizing HEP analyses and on handling low-level variables with deep learning.
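The claim that step-by-step optimization can miss the global optimum can be written schematically. The notation below is an assumption made for illustration and is not taken from the paper: $\theta$ stands for the parameters of the reconstruction stage (the backbone), $\phi$ for those of the analysis stage (the head), $\mathcal{L}_{\mathrm{pre}}$ for a pre-training objective, and $\mathcal{L}_{\mathrm{task}}$ for the objective of the final analysis.

```latex
% Sequential optimization: the backbone is fixed after pre-training,
% and only the head is fitted to the analysis task.
\theta_{\mathrm{pre}} = \arg\min_{\theta} \mathcal{L}_{\mathrm{pre}}(\theta),
\qquad
\phi_{\mathrm{seq}} = \arg\min_{\phi} \mathcal{L}_{\mathrm{task}}(\theta_{\mathrm{pre}}, \phi)

% End-to-end optimization: both stages are fitted to the analysis task.
(\theta^{*}, \phi^{*}) = \arg\min_{\theta,\,\phi} \mathcal{L}_{\mathrm{task}}(\theta, \phi)

% The sequential solution is one point of the joint search space, hence
\mathcal{L}_{\mathrm{task}}(\theta^{*}, \phi^{*})
\;\le\;
\mathcal{L}_{\mathrm{task}}(\theta_{\mathrm{pre}}, \phi_{\mathrm{seq}})
```

The inequality holds only for exact minimizers of the training objective. Gradient-based training approximates these minima, and the inequality says nothing by itself about generalization to unseen data. In this picture, fine-tuning amounts to starting the joint search from $\theta_{\mathrm{pre}}$ rather than from a random initialization.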
The paper also highlights the similarities between HEP analysis and machine learning approaches built on pretrained foundation models, and proposes a generic strategy for optimizing HEP data analysis pipelines. The experimental results show that fine-tuning strategies outperform the traditional sequential approach in both performance and data efficiency.
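The difference between a frozen pretrained backbone and an end-to-end fine-tuned one can be illustrated with a deliberately tiny toy model. This is a minimal sketch and not the authors' setup: the two-parameter linear "backbone", the scalar "head", the nine-event grid dataset, and all names (`make_data`, `mse`, `train`, `finetune_backbone`) are hypothetical choices made for illustration. The backbone starts from a "pretrained" state that only uses the first input feature, while the downstream target depends on both.

```python
# Toy illustration only: a linear "backbone" (a1, a2) and a scalar "head" h.
# All names and the dataset are hypothetical, not taken from the paper.


def make_data():
    """Nine events on a grid; the downstream target needs both features."""
    grid = (-1.0, 0.0, 1.0)
    return [((x1, x2), x1 + x2) for x1 in grid for x2 in grid]


def mse(params, data):
    """Mean squared error of the prediction h * (a1*x1 + a2*x2)."""
    a1, a2, h = params
    total = sum((h * (a1 * x1 + a2 * x2) - y) ** 2 for (x1, x2), y in data)
    return total / len(data)


def train(data, finetune_backbone, steps=500, lr=0.1):
    """Gradient descent on the task loss, with or without backbone updates."""
    # "Pretrained" starting point: the backbone ignores x2, the head is 1.
    a1, a2, h = 1.0, 0.0, 1.0
    n = len(data)
    for _ in range(steps):
        g_a1 = g_a2 = g_h = 0.0
        for (x1, x2), y in data:
            z = a1 * x1 + a2 * x2      # backbone embedding
            err = h * z - y            # head prediction minus target
            g_h += 2.0 * err * z / n
            g_a1 += 2.0 * err * h * x1 / n
            g_a2 += 2.0 * err * h * x2 / n
        h -= lr * g_h                  # the head is always trained
        if finetune_backbone:          # the backbone only when fine-tuning
            a1 -= lr * g_a1
            a2 -= lr * g_a2
    return (a1, a2, h)


data = make_data()
frozen_loss = mse(train(data, finetune_backbone=False), data)
finetuned_loss = mse(train(data, finetune_backbone=True), data)
print(f"frozen backbone loss:    {frozen_loss:.4f}")
print(f"finetuned backbone loss: {finetuned_loss:.4f}")
```

With the backbone frozen, the head cannot recover the information the embedding discarded, so the loss plateaus at a nonzero value. Allowing gradients to flow into the backbone lets the embedding adapt to the downstream task. In a realistic setting the same switch would typically be expressed by excluding the backbone parameters from the optimizer or by disabling their gradients.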