Cats vs Dogs, Photons vs Hadrons

Francesco Visconti
DOI: https://doi.org/10.1007/978-3-031-34167-0_37
2022-12-16
Abstract: In gamma-ray astronomy with Cherenkov telescopes, machine learning models are needed to guess what kind of particles generated the detected light, and their energies and directions. The focus in this work is on the classification task, training a simple convolutional neural network suitable for binary classification (as it could be a cats vs dogs classification problem), using as input uncleaned images generated by Monte Carlo data for a single ASTRI telescope. Results show an enhanced discriminant power with respect to classical random forest methods.
Instrumentation and Methods for Astrophysics, High Energy Astrophysical Phenomena
What problem does this paper attempt to address?
The paper addresses a problem in gamma-ray astronomy: how to use machine-learning models to determine which particles (photons or hadrons) produced the light detected by Cherenkov telescopes, and to estimate the energy and direction of those particles. Specifically, the author focuses on the classification task and trains a simple convolutional neural network (CNN) to distinguish between photons and hadrons.

### Problem Background

In ground-based gamma-ray astronomy, existing instruments such as H.E.S.S., MAGIC and VERITAS have demonstrated great physics potential in the teraelectronvolt (TeV) energy range. The ASTRI telescope is part of the Astrofisica con Specchi a Tecnologia Replicante Italiana project and belongs to the small-size telescope (SST) class of the Cherenkov Telescope Array (CTA); it has a diameter of 4 meters and uses a two-mirror design. The ASTRI data-processing software is developed by a dedicated team at the Italian National Institute for Astrophysics (INAF) and is mainly used for signal-to-background discrimination.

### Research Objectives

The author's goal is to distinguish photon events from hadron events by training a CNN directly on uncleaned Monte Carlo simulation images. The traditional random-forest method relies on Hillas parameters, which are calculated after image cleaning and may therefore introduce bias. A deep-learning method can process the raw image data directly and avoid this bias.

### Methods

The author used 70,000 calibrated images simulated for a single ASTRI telescope for training and 30,000 for validation, and then made predictions on 100,000 images. The images were taken before event cleaning, so they retain the contribution of the night-sky background (NSB).
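The training/validation split described above can be sketched as follows. This is an illustration only: the summary does not say how images were assigned to the two sets, so the single shuffled pool of 100,000 labelled images, the fixed seed, and the helper name `split_indices` are all assumptions.

```python
import random


def split_indices(n_train=70_000, n_val=30_000, seed=0):
    """Return disjoint index lists for training and validation.

    Hypothetical helper: assumes both sets are drawn from one shuffled
    pool of n_train + n_val labelled images.
    """
    indices = list(range(n_train + n_val))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = split_indices()
print(len(train_idx), len(val_idx))
```

The 100,000 images used for prediction are a separate set and are not part of this pool.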
The CNN architecture used is shown in the following table:

| Layer (Type) | Output Shape | Number of Parameters |
|----|----|----|
| Conv2D | (None, 54, 54, 32) | 896 |
| MaxPooling2D | (None, 27, 27, 32) | 0 |
| Conv2D | (None, 25, 25, 32) | 9,248 |
| MaxPooling2D | (None, 12, 12, 32) | 0 |
| Conv2D | (None, 10, 10, 64) | 18,496 |
| MaxPooling2D | (None, 5, 5, 64) | 0 |
| Dense | (None, 64) | 102,464 |
| Dropout | (None, 64) | 0 |
| Dense | (None, 1) | 65 |

Total: 131,169 trainable parameters.

### Results

Training took about 1 hour on an NVIDIA K20 GPU and stopped after 11 epochs. The highest validation accuracy was 96.7%, compared with 86% for the traditional random-forest method. Other evaluation metrics (such as the area under the ROC curve (AUC), F1 score, and Brier loss) also show that the CNN is better at separating photons from hadrons.

### Summary

This work shows that a small convolutional neural network improves discrimination in the gamma-hadron classification problem relative to the random-forest baseline. This indicates that the bias introduced by event parameterization is substantial. Future research can explore stereo images (i.e., events observed by multiple telescopes) and improve the reconstruction of energy and direction.
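The output shapes and parameter counts in the architecture table can be cross-checked with plain arithmetic. The sketch below assumes 3×3 convolution kernels with stride 1 and no padding, 2×2 max pooling, and a 56×56×3 input. None of these values is stated in the summary, but they are the ones that reproduce every row of the table. It also assumes a flattening step between the last pooling layer and the first dense layer, which the table does not list.

```python
# Cross-check of the architecture table using plain arithmetic.
# Assumptions (not stated in the text): 56x56x3 input, 3x3 kernels,
# stride 1, no padding, 2x2 max pooling, flatten before the dense layers.


def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    h, w, c = shape
    return (h // pool, w // pool, c), 0


def dense(units_in, units_out):
    """Unit count and parameter count of a fully connected layer."""
    return units_out, units_in * units_out + units_out


def summarize(input_shape=(56, 56, 3)):
    rows = []
    shape = input_shape
    for filters in (32, 32, 64):
        shape, params = conv2d(shape, filters)
        rows.append(("Conv2D", shape, params))
        shape, params = max_pool(shape)
        rows.append(("MaxPooling2D", shape, params))
    flat = shape[0] * shape[1] * shape[2]  # assumed flatten step
    units, params = dense(flat, 64)
    rows.append(("Dense", (units,), params))
    rows.append(("Dropout", (units,), 0))  # dropout has no parameters
    units, params = dense(units, 1)
    rows.append(("Dense", (units,), params))
    return rows


ROWS = summarize()
TOTAL = sum(params for _, _, params in ROWS)
for name, shape, params in ROWS:
    print(f"{name:<13}{str(shape):<16}{params:>8,}")
print(f"Total trainable parameters: {TOTAL:,}")
```

Under these assumptions the per-layer counts sum to the 131,169 trainable parameters reported in the table.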