Top squark signal significance enhancement by different Machine Learning Algorithms
Fraga Jorge, Rodriguez Ronald, Solano Jesus, Molano Juan, Avila Carlos
DOI: https://doi.org/10.48550/arXiv.2106.06813
2022-08-10
Abstract: Four machine learning (ML) algorithms are studied to determine which is most suitable for disentangling a hypothetical supersymmetry (SUSY) signal from its corresponding Standard Model (SM) backgrounds, and to establish their impact on signal significance. The study focuses on the production of SUSY top squark (stop) pairs, in the mass range $500<m_{\tilde{t}_1}<800$ GeV, in proton-proton collisions at a center-of-mass energy of 13 TeV with an integrated luminosity of 150 fb$^{-1}$, emulating the data-taking conditions of Run II of the LHC. In particular, the semileptonic channel is analyzed, corresponding to final states with a single isolated lepton (electron or muon), missing transverse energy, and four jets, at least one of which is tagged as a $b$-jet. The challenging compressed-spectrum region is targeted, where the stop decays mainly into a $W$ boson, a $b$-jet, and a neutralino ($\tilde{t}_1\rightarrow W+b+\tilde{\chi}_1^0$), with a mass gap between the stop and the neutralino of about 150 GeV. The ML algorithms are chosen to cover different mathematical implementations and features in machine learning. We compare the performance of a Logistic Regression (LR), a Random Forest (RF), an XGBoost (XG), and a Neural Network (NN) algorithm. Our results indicate that all four algorithms improve the signal significance relative to that obtained with a standard analysis method based on sequential requirements on different kinematic variables. The highest gain in significance is obtained with the NN and XG classifiers, with an average improvement of over 20\%; the two have statistically compatible performance over the stop mass range considered. They are followed by RF (15\%). The LR has the poorest performance of the ML algorithms studied, but still yields an average improvement of about 4\%.
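To make the quoted percentage gains concrete, the following minimal sketch shows one way a classifier-based selection could be compared with a cut-based one. It is an illustration under stated assumptions, not the analysis code of the paper: the abstract does not say which significance definition is used, so the simple estimate $Z = s/\sqrt{s+b}$ is assumed here, and the function names (`significance`, `best_threshold`, `relative_gain`) and all numbers are hypothetical.

```python
import math


def significance(s, b):
    """Assumed estimate Z = s / sqrt(s + b); returns 0.0 for an empty selection."""
    total = s + b
    return s / math.sqrt(total) if total > 0 else 0.0


def best_threshold(scores, labels, weights):
    """Scan cuts on a classifier score and return the (threshold, Z) with the largest Z.

    scores  -- classifier outputs, higher meaning more signal-like
    labels  -- 1 for signal events, 0 for background events
    weights -- per-event weights (e.g. cross section x luminosity / generated events)
    """
    best = (None, 0.0)
    for t in sorted(set(scores)):
        # Weighted signal and background yields passing the cut score >= t.
        s = sum(w for x, y, w in zip(scores, labels, weights) if x >= t and y == 1)
        b = sum(w for x, y, w in zip(scores, labels, weights) if x >= t and y == 0)
        z = significance(s, b)
        if z > best[1]:
            best = (t, z)
    return best


def relative_gain(z_ml, z_cut):
    """Percentage improvement of an ML-based significance over a cut-based one."""
    return 100.0 * (z_ml - z_cut) / z_cut


if __name__ == "__main__":
    # Toy inputs only; these are not results from the paper.
    toy_scores = [0.9, 0.8, 0.3, 0.2]
    toy_labels = [1, 1, 0, 0]
    toy_weights = [1.0, 1.0, 1.0, 1.0]
    threshold, z_ml = best_threshold(toy_scores, toy_labels, toy_weights)
    print(threshold, z_ml, relative_gain(6.0, 5.0))
```

Under this assumed definition, a selection whose significance rises from 5.0 to 6.0 corresponds to a 20% gain, the kind of relative figure the abstract reports for each classifier against the sequential-cut baseline.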
High Energy Physics - Phenomenology