Abstract: A study of four machine learning (ML) algorithms is performed to determine which technique best disentangles a hypothetical supersymmetry (SUSY) signal from its Standard Model (SM) backgrounds, and to establish their impact on signal significance. The study focuses on the production of SUSY top squark pairs (stops) in the mass range $500<m_{\tilde{t}_1}<800$ GeV, from proton-proton collisions at a center-of-mass energy of 13 TeV and an integrated luminosity of 150 fb$^{-1}$, emulating the data-taking conditions of Run II of the LHC. In particular, the semileptonic channel is analyzed, corresponding to final states with a single isolated lepton (electron or muon), missing transverse energy, and four jets, at least one of which is tagged as a $b$-jet. The challenging compressed-spectrum region is targeted, where the stop decays mainly into a $W$ boson, a $b$-jet, and a neutralino ($\tilde{t}_1\rightarrow W+b+\tilde{\chi}_1^0$), with a mass gap between the stop and the neutralino of about 150 GeV. The ML algorithms are chosen to cover different mathematical implementations and features in machine learning. We compare the performance of a Logistic Regression (LR), a Random Forest (RF), an XGBoost (XG), and a Neural Network (NN) algorithm. Our results indicate that all four algorithms improve the signal significance relative to that obtained with a standard analysis method based on sequential requirements on different kinematic variables. The highest gain in significance is obtained with the NN and XG classifiers, with an average improvement of over 20\%, both having statistically compatible performance over the stop mass range considered, followed by RF (15\%). The LR has the poorest performance of all ML algorithms studied, but still presents an average improvement of about 4\%.
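
The comparison described in the abstract, a classifier score against a cut-based selection judged by signal significance, can be illustrated with a minimal sketch. The abstract does not say which significance estimator the study uses, so the sketch assumes the common approximation $Z = s/\sqrt{s+b}$. The function names, the per-event weights, and the threshold scan are hypothetical illustrations and are not taken from the paper.

```python
import math


def significance(s, b):
    """Approximate significance Z = s / sqrt(s + b).

    Assumption: the abstract does not specify the estimator; this is
    one common choice, used here only for illustration.
    """
    total = s + b
    return s / math.sqrt(total) if total > 0 else 0.0


def best_threshold(sig_scores, bkg_scores, sig_weight, bkg_weight, n_steps=100):
    """Scan a cut on the classifier score and return (cut, Z) with the largest Z.

    sig_weight and bkg_weight are hypothetical per-event weights
    (cross section x luminosity / generated events) that scale the
    simulated samples to the expected yields.
    """
    best_cut, best_z = 0.0, 0.0
    for i in range(n_steps + 1):
        cut = i / n_steps
        s = sig_weight * sum(score >= cut for score in sig_scores)
        b = bkg_weight * sum(score >= cut for score in bkg_scores)
        z = significance(s, b)
        if z > best_z:
            best_cut, best_z = cut, z
    return best_cut, best_z


def relative_gain(z_new, z_reference):
    """Fractional improvement of z_new over a reference significance."""
    return (z_new - z_reference) / z_reference
```

With this definition, a classifier that raises the significance from 5.0 to 6.0 with respect to the sequential-cut selection gives `relative_gain(6.0, 5.0)` of 0.2, that is, a 20\% improvement of the kind quoted for the NN and XG classifiers.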

Parameter Inference from Event Ensembles and the Top-Quark Mass

Machine learning approaches for parameter reweighting in Monte-Carlo samples of top quark production in CMS

An implementation of neural simulation-based inference for parameter estimation in ATLAS

Top Quark Mass Extractions from Energy Correlators: A Feasibility Study

A New Paradigm for Precision Top Physics: Weighing the Top with Energy Correlators

Precision Top Mass Measurement Using Energy Correlators

An Optimal Scheme for Top Quark Mass Measurement Near the $t\bar{t}$ Threshold at Future $e^{+}e^{-}$ Colliders

A Holistic Approach to Predicting Top Quark Kinematic Properties with the Covariant Particle Transformer

Top quark physics in the Large Hadron Collider era

Determination of the top-quark mass from top-quark pair events with the matrix element method at next-to-leading order: Potential and prospects

Top-quark pole mass extraction at NNLO accuracy, from total, single- and double-differential cross sections for $t\bar{t}+X$ production at the LHC

Top squark signal significance enhancement by different Machine Learning Algorithms

Top-quark Mass Determination from T-Channel Single Top Production at the LHC

Prospect of measuring the top quark mass through energy correlators

When the Machine Chimes the Bell: Entanglement and Bell Inequalities with Boosted $t\bar{t}$

Accuracy versus precision in boosted top tagging with the ATLAS detector