Abstract: Time-domain astronomy is progressing rapidly with the ongoing and upcoming large-scale photometric sky surveys led by the Vera C. Rubin Observatory project (LSST). Billions of variable sources call for better automatic classification algorithms for light curves. Among them, periodic variable stars are frequently studied. The categories of periodic variable stars are highly imbalanced, which poses a challenge to classification algorithms, including deep learning methods. We design two neural network architectures for the classification of periodic variable stars in the Catalina Survey's Data Release 2: a multi-input recurrent neural network (RNN) and a compound network combining the RNN with a convolutional neural network (CNN). To deal with class imbalance, we apply Gaussian processes to generate synthetic light curves with artificial uncertainties for data augmentation. For better performance, we organize the augmentation and training process in a "bagging-like" ensemble learning scheme. The experimental results show that the better approach is the compound network combining RNN and CNN, which reaches the best result of 86.2% on the overall balanced accuracy and 0.75 on the macro F1 score. We develop the ensemble augmentation method to address data imbalance when classifying variable stars and demonstrate the effectiveness of combining different representations of light curves in a single model. The proposed methods would help build better classification algorithms for periodic time series data in future sky surveys (e.g., LSST).
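As a rough illustration of the Gaussian-process augmentation mentioned in the abstract, the sketch below fits a GP to one phase-folded light curve and draws a synthetic curve from its posterior. It is not the authors' implementation: the periodic (exp-sine-squared) kernel, the fixed hyperparameters, the function names, and the rule of resampling observed errors to obtain "artificial uncertainties" are all assumptions made for the example.

```python
import numpy as np


def periodic_kernel(p1, p2, amplitude, length_scale=0.5):
    """Exp-sine-squared covariance with period 1 (phases in [0, 1)).

    Assumed kernel and length scale; the abstract does not specify either.
    """
    d = p1[:, None] - p2[None, :]
    return amplitude**2 * np.exp(-2.0 * np.sin(np.pi * d) ** 2 / length_scale**2)


def gp_synthetic_light_curve(phase, mag, mag_err, n_points=50, seed=None):
    """Draw one synthetic folded light curve from a GP fitted to the input."""
    rng = np.random.default_rng(seed)
    offset = mag.mean()
    y = mag - offset
    amplitude = y.std() if y.std() > 0 else 1.0

    # GP posterior conditioned on the observed points; the measurement
    # variances enter as heteroscedastic noise on the diagonal.
    K = periodic_kernel(phase, phase, amplitude) + np.diag(mag_err**2)
    new_phase = np.sort(rng.uniform(0.0, 1.0, n_points))
    K_s = periodic_kernel(new_phase, phase, amplitude)
    K_ss = periodic_kernel(new_phase, new_phase, amplitude)
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    v = np.linalg.solve(chol, K_s.T)
    post_mean = K_s @ alpha
    post_cov = K_ss - v.T @ v

    # One posterior draw; clip tiny negative eigenvalues caused by round-off.
    w, vecs = np.linalg.eigh(0.5 * (post_cov + post_cov.T))
    scale = np.sqrt(np.clip(w, 0.0, None))
    draw = post_mean + vecs @ (scale * rng.standard_normal(n_points))

    # Assumption: artificial uncertainties are resampled from the observed
    # errors, and matching Gaussian noise is added to the magnitudes.
    new_err = rng.choice(mag_err, size=n_points, replace=True)
    new_mag = offset + draw + rng.normal(0.0, new_err)
    return new_phase, new_mag, new_err
```

Calling `gp_synthetic_light_curve(phase, mag, mag_err, seed=0)` several times with different seeds would yield several distinct synthetic curves for a minority-class star. In practice the kernel hyperparameters would normally be fitted per light curve rather than fixed as they are here.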

What problem does this paper attempt to address?

The paper primarily addresses the classification of periodic variable stars in astronomy, particularly how to handle the impact of data imbalance on deep learning algorithms. Specifically, the main challenges faced by the researchers include:

1. **Data Imbalance Problem**: The number of samples for different types of periodic variable stars varies greatly, leading to machine learning models that tend to favor the majority class while ignoring the minority class.
2. **Application of Deep Learning Methods**: Although traditional machine learning methods have many techniques for handling imbalanced data, such techniques are still immature in the field of deep learning, especially when dealing with light curve data in astrophysics.

To address the above issues, the authors proposed the following solutions:

- Designed two neural network architectures: a multi-input Recurrent Neural Network (RNN) and a composite network combining RNN and Convolutional Neural Network (CNN) to utilize different types of input information for classification.
- Used Gaussian processes to generate synthetic light curves to increase the amount of data for the minority class while preserving uncertainty information.
- Implemented an ensemble learning method based on the "bagging" concept, by constructing multiple sub-datasets and training different neural network models, then averaging the results of these models to improve overall classification performance and mitigate overfitting issues.

Through experimental evaluation, the authors found that the composite network combining RNN and CNN achieved an overall balanced accuracy of 86.2% and a macro F1 score of 0.75, showing significant improvement compared to using only RNN or other data augmentation methods. Additionally, this approach demonstrated the effectiveness of combining light curves in different representations within a single model, providing strong support for the classification of periodic time series data in future large-scale sky survey projects (such as LSST).
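The "bagging-like" averaging and the two reported metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, the class-balanced sub-datasets are built by sampling indices with replacement (a stand-in for the Gaussian-process augmentation), and the per-model class probabilities are taken as given rather than produced by trained RNN or CNN models.

```python
import numpy as np


def balanced_subsets(labels, n_subsets, per_class, seed=None):
    """Build index sets holding `per_class` examples of every class.

    Sampling with replacement stands in for augmentation here (assumption).
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    subsets = []
    for _ in range(n_subsets):
        picks = [
            rng.choice(np.flatnonzero(labels == c), size=per_class, replace=True)
            for c in classes
        ]
        subsets.append(np.concatenate(picks))
    return subsets


def ensemble_predict(prob_list):
    """Average class probabilities over models, then take the argmax."""
    mean_prob = np.mean(np.stack(prob_list), axis=0)
    return mean_prob.argmax(axis=1)


def balanced_accuracy(y_true, y_pred, n_classes):
    """Mean per-class recall over the classes present in y_true."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))
```

Both metrics weight every class equally, which is why they suit an imbalanced catalogue better than plain accuracy: a model that ignores a rare class is penalized by that class's recall and F1 regardless of how few examples it has.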

Periodic Variable Star Classification with Deep Learning: Handling Data Imbalance in an Ensemble Augmentation Way

A Novel Approach for Variable Star Classification Based on Imbalanced Learning

Scalable End-to-end Recurrent Neural Network for Variable star classification

Application of Convolutional Neural Networks to time domain astrophysics. 2D image analysis of OGLE light curves

Hierarchical Classification of Variable Stars Using Deep Convolutional Neural Networks

Semi-supervised classification and clustering analysis for variable stars

Transfer Learning Applied to Stellar Light Curve Classification

Classification of Periodic Variable Stars with Novel Cyclic-Permutation Invariant Neural Networks

Light curve classification with recurrent neural networks for GOTO: dealing with imbalanced data

Automatic Survey-Invariant Variable Star Classification

A Package for the Automated Classification of Periodic Variable Stars

Identifying Light-curve Signals with a Deep-learning-based Object Detection Algorithm. II. A General Light-curve Classification Framework

Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification

Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with Distinct Inlier Categories

Deep-Learnt Classification of Light Curves

LEAVES: An Expandable Light-curve Data Set for Automatic Classification of Variable Stars

Advanced Astroinformatics for Variable Star Classification

Calibrating Long Period Variables as Standard Candles with Machine Learning

Astronomical image time series classification using CONVolutional attENTION (ConvEntion)

Results of the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC)