Stacking machine learning classifiers to identify Higgs bosons at the LHC

Alexandre Alves
DOI: https://doi.org/10.1088/1748-0221/12/05/T05005
2017-05-31
Abstract: Machine learning (ML) algorithms have been used to classify signal and background events with high accuracy in particle physics. In this paper, we compare the performance of a widespread ML technique, stacked generalization, against the results of two state-of-the-art algorithms: (1) a deep neural network (DNN) in the task of discovering a new neutral Higgs boson and (2) a scalable machine learning system for tree boosting in the Standard Model Higgs to tau leptons channel, both at the 8 TeV LHC. In a cut-and-count analysis, stacking three algorithms performed around 16% worse than the DNN while demanding far less computational effort; the same stacking, however, outperforms boosted decision trees. Using the stacked classifiers in a multivariate statistical analysis (MVA), on the other hand, significantly enhances the statistical significance compared to cut-and-count in both Higgs processes, suggesting that combining an ensemble of simpler and faster ML algorithms with MVA tools is a better approach than building a complex state-of-the-art algorithm for cut-and-count.
Subjects: High Energy Physics - Phenomenology; Machine Learning; Data Analysis, Statistics and Probability
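
The cut-and-count analysis mentioned in the abstract can be illustrated with a minimal sketch: events whose classifier output exceeds a threshold are kept, signal and background yields are counted, and a significance is computed from the two counts. The abstract does not say which significance formula the paper uses, so the Asimov approximation below is an assumption, and the function names and toy scores are hypothetical, chosen only for illustration.

```python
import math


def asimov_significance(s, b):
    """Asimov approximation to the median discovery significance.

    s, b: expected signal and background yields after the cut.
    Assumed formula (not taken from the abstract):
        Z = sqrt(2 * ((s + b) * ln(1 + s / b) - s))
    """
    if b <= 0:
        raise ValueError("background yield must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def cut_and_count(scores, labels, weights, threshold):
    """Sum the weights of signal (label 1) and background (label 0)
    events whose classifier score passes the threshold."""
    s = sum(w for p, y, w in zip(scores, labels, weights)
            if p >= threshold and y == 1)
    b = sum(w for p, y, w in zip(scores, labels, weights)
            if p >= threshold and y == 0)
    return s, b


def best_cut(scores, labels, weights, thresholds):
    """Scan candidate thresholds and return (threshold, significance)
    for the cut with the highest significance, or None if every cut
    removes all background."""
    best = None
    for t in thresholds:
        s, b = cut_and_count(scores, labels, weights, t)
        if b <= 0:
            continue  # the formula is undefined without background
        z = asimov_significance(s, b)
        if best is None or z > best[1]:
            best = (t, z)
    return best


# Toy, made-up classifier outputs (hypothetical, not from the paper).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
weights = [1.0] * 8

if __name__ == "__main__":
    print(best_cut(scores, labels, weights, [0.25, 0.5, 0.75, 0.85]))
```

In this sketch the scores could come from any classifier, including a stacked ensemble; a multivariate analysis would instead use the full shape of the score distribution rather than a single cut.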