Combining statistical learning with deep learning for improved exoplanet detection and characterization

Olivier Flasseur, Théo Bodrito, Julien Mairal, Jean Ponce, Maud Langlois, Anne-Marie Lagrange
2024-09-20
Abstract: In direct imaging at high contrast, the bright glare produced by the host star makes the detection and the characterization of sub-stellar companions particularly challenging. In spite of the use of an extreme adaptive optics system combined with a coronagraphic mask to strongly attenuate the starlight contamination, dedicated post-processing methods combining several images recorded with the pupil tracking mode of the telescope are needed to reach the required contrast. In that context, we recently proposed to combine the statistics-based model of PACO with a deep learning approach in a three-step algorithm. First, the data are centered and whitened locally using the PACO framework to improve the stationarity and the contrast in a preprocessing step. Second, a convolutional neural network (CNN) is trained in a supervised fashion to detect the signature of synthetic sources in the preprocessed science data. Finally, the trained network is applied to the preprocessed observations and delivers a detection map. A second network is trained to infer locally the photometry of detected sources. Both deep models are trained from scratch with a custom data augmentation strategy that generates a large training set from a single spatio-temporo-spectral dataset. This strategy can be applied to process jointly the images of observations conducted with angular, and eventually spectral, differential imaging (A(S)DI). In this proceeding, we present in a unified framework the key ingredients of the deep PACO algorithm both for ADI and ASDI. We apply our method to several datasets from the IRDIS imager of the VLT/SPHERE instrument. Our method reaches, on average, a better trade-off between precision and recall than the comparative algorithms.
Instrumentation and Methods for Astrophysics
What problem does this paper attempt to address?
The paper addresses the difficulty of detecting and characterizing sub-stellar companions (such as exoplanets) in high-contrast direct imaging, where the glare of the host star dominates the signal. Although extreme adaptive optics systems and coronagraphs significantly reduce starlight contamination, dedicated post-processing methods are still required to reach the contrast needed for detection. Specifically, although the existing PACO algorithm captures local spatial correlations well, its statistical model fits the observational data only approximately, which leaves room for improvement in detection sensitivity at short angular separations. For this reason, the authors propose **deep PACO**, a method that combines statistical learning and deep learning to improve the detection sensitivity for exoplanets.

### Main problem summary:

1. **Strong glare in high-contrast imaging**: The intense glare of the host star makes it very difficult to detect sub-stellar companions.
2. **Limitations of existing methods**: Existing post-processing methods (such as cADI, PCA, TLOCI, etc.) have limited detection sensitivity at short angular separations.
3. **Room for improvement in the PACO algorithm**: Although PACO performs well, its sensitivity at short angular separations is limited because its statistical model does not fit the observational data precisely enough.

### Solutions:

- **Combining statistical learning and deep learning**: A three-step algorithm combines the statistical model of PACO with deep learning.
  1. **Pre-processing**: Use the PACO framework to center and whiten the data locally, which improves stationarity and contrast.
  2. **Training a convolutional neural network (CNN)**: Train a CNN in a supervised fashion to detect the signature of synthetic sources in the pre-processed science data.
  3. **Applying the trained network**: Apply the trained network to the pre-processed observations to produce a detection map. A second network, trained separately, estimates the photometry of the detected sources.

### Formula presentation:

- **Observation model**:
  \[ r_\ell = f_\ell + \sum_{p=1}^{P} \alpha_{p,\ell}\, h_\ell(\phi_p) \]
  where \(r_\ell\) is the observed image in the \(\ell\)-th spectral channel, \(f_\ell\) is the noise component, \(P\) is the number of sources, \(h_\ell(\phi_p)\) is the point spread function (PSF) centered at position \(\phi_p\), \(\alpha_{p,\ell}\) is the intensity of source \(p\) in channel \(\ell\), and \(\phi_p\) is the position of source \(p\).
- **Statistical modeling**:
  - Sample covariance matrix \(\hat{S}_n\), estimated locally from the \(T\) temporal frames and \(L\) spectral channels:
    \[ \hat{S}_n = \frac{1}{TL} \sum_{t,\ell} \hat{\sigma}_{n,t,\ell}^{-2} \left(r_{n,t,\ell} - \hat{m}_{n,\ell}\right)\left(r_{n,t,\ell} - \hat{m}_{n,\ell}\right)^\top \]
    where \(\hat{m}_{n,\ell}\) is the estimated local mean and \(\hat{\sigma}_{n,t,\ell}\) is a scaling factor.
  - Regularized covariance matrix \(\hat{C}_n\):
    \[ \hat{C}_n = (1 - \hat{\rho}_n)\,\hat{S}_n + \hat{\rho}_n\,\hat{F}_n \]
    where \(\hat{F}_n\) is a diagonal matrix containing the sample variances and \(\hat{\rho}_n\) is the weight that balances the two terms.
- **Loss function**:
  - The networks are trained with a Dice loss:
    \[ L^{[s]} = 1 - \frac{\sum_m y_m^{[s]} \hat{y}_m^{[s]} + \epsilon}{\sum_m \left(y_m^{[s]} + \hat{y}_m^{[s]}\right) + \epsilon} - \frac{\sum_m \left(1 - y_m^{[s]}\right)\left(1 - \hat{y}_m^{[s]}\right) + \epsilon}{\sum_m \left(2 - y_m^{[s]} - \hat{y}_m^{[s]}\right) + \epsilon} \]
    where \(y_m^{[s]}\) and \(\hat{y}_m^{[s]}\) are the ground-truth and predicted values at pixel \(m\), and \(\epsilon\) is a small constant that keeps the ratios well defined.
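
The statistical modeling and loss formulas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it makes several simplifying assumptions: the function names are hypothetical, the shrinkage weight `rho` is passed in rather than estimated from the data, a single mean is computed over all samples (instead of one per spectral channel), the scaling factors default to 1, and whitening is done with a Cholesky factor, which is one common choice among several.

```python
import numpy as np


def shrinkage_covariance(patches, rho, sigma=None):
    """Regularized covariance C = (1 - rho) * S + rho * F for one patch location.

    patches: (N, K) array, N = T * L vectorized patches of K pixels each.
    rho:     shrinkage weight in [0, 1] (assumed given, not estimated here).
    sigma:   optional (N,) scaling factors; defaults to 1 (simplification).
    """
    patches = np.asarray(patches, dtype=float)
    n_samples = patches.shape[0]
    sigma = np.ones(n_samples) if sigma is None else np.asarray(sigma, dtype=float)
    mean = patches.mean(axis=0)                       # single mean (simplification)
    centered = (patches - mean) / sigma[:, None]      # sigma^-1 (r - m)
    sample_cov = centered.T @ centered / n_samples    # S: sample covariance
    diag_cov = np.diag(np.diag(sample_cov))           # F: sample variances only
    cov = (1.0 - rho) * sample_cov + rho * diag_cov   # C: convex combination
    return mean, cov


def whiten(patches, mean, cov):
    """Center the patches and whiten them with the Cholesky factor of cov."""
    chol = np.linalg.cholesky(cov)                    # cov = chol @ chol.T
    centered = np.asarray(patches, dtype=float) - mean
    return np.linalg.solve(chol, centered.T).T        # chol^-1 (r - m), per sample


def dice_loss(y, y_hat, eps=1e-6):
    """Two-term Dice loss: one ratio for the sources, one for the background."""
    y = np.asarray(y, dtype=float).ravel()
    y_hat = np.asarray(y_hat, dtype=float).ravel()
    source_term = (np.sum(y * y_hat) + eps) / (np.sum(y + y_hat) + eps)
    background_term = (np.sum((1 - y) * (1 - y_hat)) + eps) / (
        np.sum(2 - y - y_hat) + eps
    )
    return 1.0 - source_term - background_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for N = 200 patches of K = 9 pixels (not real data).
    fake_patches = rng.normal(size=(200, 9))
    mean, cov = shrinkage_covariance(fake_patches, rho=0.2)
    whitened = whiten(fake_patches, mean, cov)
    print("whitened shape:", whitened.shape)
    print("loss for a perfect prediction:", dice_loss([0, 1, 1, 0], [0, 1, 1, 0]))
```

With this form of the loss, a perfect binary prediction makes each ratio close to 1/2, so the loss is close to 0, while a fully inverted prediction drives both ratios toward 0 and the loss toward 1. Mixing the sample covariance with its diagonal keeps the per-pixel variances unchanged and only shrinks the off-diagonal terms, which helps keep the matrix well conditioned when few samples are available.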