Development and Applications of Boosted Boson Jet Tagging Methods in the CMS Experiment
Leyun Gao, Dawei Fu, Yuzhe Zhao, Qilong Guo, Sitian Qian, Tianyi Yang, Sen Deng, Qiang Li
DOI: https://doi.org/10.1360/tb-2023-1120
2024-01-01
Abstract: Boosted boson jet tagging methods have attracted great interest in theoretical and experimental high energy physics (HEP) research over the past decades and have become both a driving force and a target of innovation at the high energy frontier. Boosted bosons are often decay products of massive resonances and are widely studied in searches for new particles and new physics beyond the standard model (BSM). For boosted gauge bosons (V) and Higgs bosons (H), the CMS Collaboration has developed tagging techniques and calibration methods for processes such as V -> qq and H -> bb and has applied them to searches for BSM processes such as multi-boson resonances. This paper reviews the development and applications of boosted boson tagging methods in the CMS experiment over the past decade, including the most recently proposed techniques and algorithms, especially those based heavily on deep learning techniques developed in the artificial intelligence era. Traditional tagging methods focus mainly on characteristic jet substructure. They work well because signal processes usually produce jet substructure that is distinguishable from the background. Quantitative measures of jet substructure serve as signal-background discriminants: examples are N-subjettiness and energy correlation functions (ECFs), which indicate the number of prongs in a jet, and the soft-dropped four-momentum, which is usually more precise than the raw one. Recently, the introduction of machine learning techniques has brought a major breakthrough in boosted boson tagging. Lessons from successful industrial applications of machine learning have been used to build powerful and generalizable jet tagging models. For example, convolutional neural networks (CNNs) are well known for their ability to extract information from images and sequences.
Once a jet is represented as an image in the eta-phi (pseudorapidity-azimuthal angle) plane or as a sequence of particles, 2D or 1D CNNs can be applied to jet classification. For example, DeepAK8, a major step forward in boosted boson tagging, adopts a deep residual convolutional architecture inspired by ResNet, the state-of-the-art image classification model at that time. Another example is ParticleNet, which is inspired by DGCNN. In the context of high energy physics, symmetries can be built into neural network models as inductive biases to improve efficiency and accuracy, and mathematical techniques are also important for reducing the computational cost. This is the philosophy of LorentzNet, one of the successors of ParticleNet for boosted jet tagging. After LorentzNet, Particle Transformer (ParT) followed with the rise of the Transformer architecture and is expected to see growing use in future CMS analyses. Appropriate calibration methods have been developed so that new boosted boson tagging methods can be applied in the CMS experiment. They are usually designed for specific tagging methods. For example, to calibrate H -> cc̄ tagging, proxy methods have been developed to select QCD events that are similar enough to the signal process. Finally, three concrete examples illustrate the extensive use of boosted boson tagging and calibration methods in the CMS experiment: CMS has successively published results of searches for a heavy di-boson resonance, a heavy tri-boson resonance, and H -> cc̄. The boosted boson tagging and calibration methods used in those analyses evolved from traditional jet substructure methods to the DeepAK8 neural network and then to the ParticleNet neural network.
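
The jet-image representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the CMS implementation: the function name `jet_image`, the 32 x 32 pixel grid, and the half-width of 0.8 (chosen here to match a jet radius of R = 0.8) are all assumptions made for illustration. Constituent transverse momenta are binned in (delta eta, delta phi) relative to the jet axis, and the resulting 2D array is the kind of input a 2D CNN could consume.

```python
import numpy as np


def jet_image(eta, phi, pt, jet_eta, jet_phi, n_pix=32, half_width=0.8):
    """Pixelate jet constituents into a pT-weighted image in the eta-phi plane.

    Hypothetical helper for illustration only; grid size and window
    are assumed values, not taken from any CMS tagger.
    """
    eta = np.asarray(eta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    pt = np.asarray(pt, dtype=float)

    # Coordinates relative to the jet axis.
    d_eta = eta - jet_eta
    # Wrap the azimuthal difference into [-pi, pi) so that constituents
    # on either side of the phi = +-pi boundary stay close together.
    d_phi = (phi - jet_phi + np.pi) % (2.0 * np.pi) - np.pi

    # Square grid centred on the jet axis; constituents outside are dropped.
    edges = np.linspace(-half_width, half_width, n_pix + 1)
    image, _, _ = np.histogram2d(d_eta, d_phi, bins=(edges, edges), weights=pt)

    # Normalise to unit total intensity (left untouched if the image is empty).
    total = image.sum()
    return image / total if total > 0 else image


# Toy jet with three constituents straddling the phi = +-pi boundary.
img = jet_image(
    eta=[0.10, -0.20, 0.05],
    phi=[3.10, -3.10, 3.00],
    pt=[100.0, 80.0, 20.0],
    jet_eta=0.0,
    jet_phi=3.14,
)
```

The sequence-of-particles representation would instead keep the per-constituent arrays (for example sorted by transverse momentum) and pass them to a 1D CNN, which avoids the loss of resolution that pixelation introduces.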