Improving Earth-like planet detection in radial velocity using deep learning

Yinan Zhao, Xavier Dumusque, Michael Cretignier, Andrew Collier Cameron, David W. Latham, Mercedes López-Morales, Michel Mayor, Alessandro Sozzetti, Rosario Cosentino, Isidro Gómez-Vargas, Francesco Pepe, Stephane Udry
2024-05-22
Abstract: Many novel methods have been proposed to mitigate stellar activity for exoplanet detection, as the presence of stellar activity in radial velocity (RV) measurements is the current major limitation. Unlike traditional methods that model stellar activity in the RV domain, more methods are moving in the direction of disentangling stellar activity at the spectral level. The goal of this paper is to present a novel convolutional neural network-based algorithm that efficiently models stellar activity signals at the spectral level, enhancing the detection of Earth-like planets. We trained a convolutional neural network to build the correlation between the change in the spectral line profile and the corresponding RV, full width at half maximum (FWHM) and bisector span (BIS) values derived from the classical cross-correlation function. This algorithm has been tested on three intensively observed stars: Alpha Centauri B (HD128621), Tau Ceti (HD10700), and the Sun. By injecting simulated planetary signals at the spectral level, we demonstrate that our machine learning algorithm can achieve, for HD128621 and HD10700, a detection threshold of 0.5 m/s in semi-amplitude for planets with periods ranging from 10 to 300 days. This threshold would correspond to the detection of a $\sim$4 $\mathrm{M}_{\oplus}$ planet in the habitable zone of those stars. On the HARPS-N solar dataset, our algorithm is even more efficient at mitigating stellar activity signals and can reach a threshold of 0.2 m/s, which would correspond to a 2.2 $\mathrm{M}_{\oplus}$ planet on the orbit of the Earth. To the best of our knowledge, this is the first time that such low detection thresholds have been reported, both for the Sun and for other stars, which highlights the efficiency of our convolutional neural network-based algorithm at mitigating stellar activity in RV measurements.
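The solar figure quoted in the abstract (a 0.2 m/s semi-amplitude corresponding to a 2.2 $\mathrm{M}_{\oplus}$ planet on the orbit of the Earth) can be checked with the standard RV semi-amplitude relation. The sketch below is not taken from the paper: it assumes a circular orbit, a planet mass negligible compared with the stellar mass, and a one-solar-mass host, and the function name `min_mass_earth` is a hypothetical helper introduced only for illustration. The physical constants are standard values hard-coded here.

```python
import math

# Standard physical constants (SI units), hard-coded for this sketch.
G = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
M_EARTH = 5.9722e24  # Earth mass, kg
DAY = 86400.0        # seconds per day


def min_mass_earth(k_ms, period_days, mstar_msun=1.0):
    """Minimum mass M_p*sin(i), in Earth masses, for an RV semi-amplitude.

    Assumes a circular orbit and M_p << M_*, so that
    M_p*sin(i) ~= K * (P / (2*pi*G))**(1/3) * M_***(2/3).
    """
    period_s = period_days * DAY
    mstar_kg = mstar_msun * M_SUN
    mass_kg = (
        k_ms
        * (period_s / (2.0 * math.pi * G)) ** (1.0 / 3.0)
        * mstar_kg ** (2.0 / 3.0)
    )
    return mass_kg / M_EARTH


# Solar case from the abstract: K = 0.2 m/s at a one-year period.
solar_threshold = min_mass_earth(0.2, 365.25)
print(f"K = 0.2 m/s at 1 yr around the Sun: {solar_threshold:.2f} Earth masses")
```

Under these assumptions the result comes out at roughly 2.2 Earth masses, in line with the abstract. The $\sim$4 $\mathrm{M}_{\oplus}$ figure for HD128621 and HD10700 is not reproduced here, because it depends on stellar masses and habitable-zone periods that this page does not give.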
Earth and Planetary Astrophysics, Instrumentation and Methods for Astrophysics, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the main obstacle to detecting Earth-like planets with the radial velocity (RV) technique: the imprint of stellar activity on RV measurements. Specifically:

1. **Interference from stellar activity**: Stellar activity arises from complex physical processes acting on different timescales, such as granulation, supergranulation, evolving spots and faculae, and the star's magnetic cycle. These processes add noise to RV measurements, making low-mass planets extremely difficult to detect.
2. **Limitations of traditional methods**: Traditional RV analyses model stellar activity in the RV domain, an approach that often fails to separate planetary signals from stellar activity signals. More recent studies instead try to disentangle stellar activity at the spectral level to improve the accuracy of planet detection.
3. **Application of deep learning**: The paper proposes a new algorithm based on a convolutional neural network (CNN) that efficiently models stellar activity signals at the spectral level, thereby improving the detection of Earth-like planets.

### Research objectives

The objective of the paper is to present a CNN-based algorithm that models stellar activity signals at the spectral level and thereby improves the detection of Earth-like planets.

### Methods

- **Data pre-processing**: The spectra are first converted into a normalized flux versus flux-gradient space (the shell spectral representation). This reduces the dimensionality of the data and emphasizes the stellar activity information that arises from the suppression of convective blueshift.
- **Training data generation**: The number of training samples is increased through cross-validation techniques, to compensate for the limited data available for different stellar types.
- **Neural network architecture**: A convolutional neural network performs the regression. The input is a shell of shape 10×10, and the outputs are the RV, FWHM, and BIS values. To prevent the model from absorbing planetary signals, the calcium activity index log(R'HK) is also included among the outputs.

### Results

- **Test stars**: The algorithm was tested on three intensively observed stars: Alpha Centauri B (HD 128621), Tau Ceti (HD 10700), and the Sun.
- **Detection thresholds**: For HD 128621 and HD 10700, the algorithm reaches a semi-amplitude detection threshold of 0.5 m/s for planets with periods between 10 and 300 days, corresponding to a planet of about 4 Earth masses in the habitable zone of those stars. On the HARPS-N solar data set, due to the larger amount of data, the algorithm reaches a detection threshold of 0.2 m/s, corresponding to a planet of 2.2 Earth masses on an orbit similar to that of the Earth.

### Conclusions

As far as the authors know, this is the first report of such low detection thresholds, both for the Sun and for other stars. This highlights the effectiveness of the CNN-based algorithm in mitigating stellar activity in RV measurements.

### Keywords

Method: Data analysis; Technique: Radial velocity; Technique: Spectroscopy; Star: Activity

### Formulas

- **Flux change formula**:

  \[ f(\lambda_i) - f_0(\lambda_i) = \frac{df_0(\lambda_i)}{d\lambda_i}\cdot\delta\lambda_i = \frac{df_0(\lambda_i)}{d\lambda_i}\cdot\frac{\delta V_i}{c}\cdot\lambda_i \]

  \[ \delta V_i = \frac{c}{\lambda_i}\cdot\left(f(\lambda_i) - f_0(\lambda_i)\right)\cdot\left(\frac{df_0(\lambda_i)}{d\lambda_i}\right)^{-1} \]

- **Doppler effect formula**: \[ \delta V_i =