cecilia: A Machine Learning-Based Pipeline for Measuring Metal Abundances of Helium-rich Polluted White Dwarfs

M. Badenas-Agusti, J. Viaña, A. Vanderburg, S. Blouin, P. Dufour, S. Xu, L. Sha
2024-02-08
Abstract: Over the past several decades, conventional spectral analysis techniques of polluted white dwarfs have become powerful tools to learn about the geology and chemistry of extrasolar bodies. Despite their proven capabilities and extensive legacy of scientific discoveries, these techniques are, however, still limited by their manual, time-intensive, and iterative nature. As a result, they are susceptible to human errors and are difficult to scale up to population-wide studies of metal pollution. This paper seeks to address this problem by presenting cecilia, the first Machine Learning (ML)-powered spectral modeling code designed to measure the metal abundances of intermediate-temperature (10,000 $\leq T_{\rm eff} \leq$ 20,000 K), Helium-rich polluted white dwarfs. Trained with more than 22,000 randomly drawn atmosphere models and stellar parameters, our pipeline aims to overcome the limitations of classical methods by replacing the generation of synthetic spectra from computationally expensive codes and uniformly spaced model grids with a fast, automated, and efficient neural-network-based interpolator. More specifically, cecilia combines state-of-the-art atmosphere models, powerful artificial intelligence tools, and robust statistical techniques to rapidly generate synthetic spectra of polluted white dwarfs in high-dimensional space, and enable accurate ($\lesssim$0.1 dex) and simultaneous measurements of 14 stellar parameters -- including 11 elemental abundances -- from real spectroscopic observations. As massively multiplexed astronomical surveys begin scientific operations, cecilia's performance has the potential to unlock large-scale studies of extrasolar geochemistry and propel the field of white dwarf science into the era of Big Data. In doing so, we aspire to uncover new statistical insights that were previously impractical with traditional white dwarf characterisation techniques.
Instrumentation and Methods for Astrophysics, Earth and Planetary Astrophysics, Solar and Stellar Astrophysics, Machine Learning
### What problems does this paper attempt to solve?

This paper aims to address the limitations of traditional spectral analysis techniques for measuring the metal abundances of helium-rich polluted white dwarfs. Although these methods are powerful and have a long record of scientific discoveries, they have the following shortcomings:

1. **Manual, time-consuming, and highly iterative**: These methods rely on manual operations, are prone to human error, and are difficult to scale up to population-wide studies.
2. **High computational cost**: Generating synthetic spectra with traditional codes requires substantial computational resources, especially in high-dimensional parameter spaces.

To solve these problems, the authors have developed **cecilia**, a machine-learning (ML)-based spectral modeling code designed to measure the metal abundances of helium-rich polluted white dwarfs in the intermediate temperature range (10,000 $\leq T_{\rm eff} \leq$ 20,000 K). Trained on more than 22,000 randomly drawn atmosphere models and stellar parameters, cecilia can quickly and automatically generate synthetic spectra and measure 14 stellar parameters, including 11 elemental abundances, to an accuracy of $\lesssim$0.1 dex.

### Main objectives

- **Improve efficiency**: By replacing computationally expensive codes and uniformly spaced model grids with a neural-network-based interpolator, cecilia can quickly generate synthetic spectra in high-dimensional space.
- **Reduce human error**: The automated process reduces the errors introduced by manual operations.
- **Support large-scale research**: As massively multiplexed astronomical surveys begin operations, cecilia's performance is expected to unlock large-scale studies of extrasolar geochemistry and bring white dwarf science into the era of Big Data.
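For readers unfamiliar with the unit, the $\lesssim$0.1 dex accuracy quoted above can be translated into a multiplicative factor. A dex is one order of magnitude in base-10 logarithmic space, so an offset of $\Delta$ dex corresponds to a factor of $10^{\Delta}$. The short sketch below only illustrates this conversion; the function name `dex_to_factor` is a hypothetical helper and is not part of cecilia.

```python
def dex_to_factor(delta_dex: float) -> float:
    """Convert a logarithmic offset in dex to a multiplicative factor."""
    return 10.0 ** delta_dex


# A 0.1 dex offset in a logarithmic abundance
factor = dex_to_factor(0.1)
print(f"0.1 dex corresponds to a factor of {factor:.3f}")  # 1.259
```

In other words, an abundance measured to within 0.1 dex is constrained to within roughly 26% of its value in linear units.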
### Method overview

To analyze polluted white dwarf spectra rapidly and accurately, cecilia combines state-of-the-art atmosphere models, powerful artificial intelligence tools, and robust statistical techniques. At its core is a deep neural network architecture, which consists of three main parts:

1. **Autoencoder**: Used to compress and reconstruct input data and identify the main features.
2. **Fully-connected neural network (FCNN1)**: Used to generate spectra of polluted white dwarfs.
3. **Fine-tuned fully-connected neural network (FT FCNN2)**: Used to predict stellar parameters from spectra.

With this approach, cecilia not only improves the speed and accuracy of the analysis, but also provides a flexible infrastructure for future improvements.
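To make the first two components more concrete, here is a minimal, untrained sketch of how an autoencoder and a fully-connected network could be chained to emulate spectra. It is not cecilia's actual implementation. The layer sizes, the ReLU activations, the random weights, and the choice to have the fully-connected network predict the autoencoder's latent code are all assumptions made for illustration, and every name (`dense`, `forward`, `emulate_spectrum`, `N_LATENT`, `N_PIXELS`) is hypothetical. Only the number of stellar parameters (14) comes from the text. The fine-tuned network is not included.

```python
import random

random.seed(0)

N_PARAMS = 14    # stellar parameters, as stated in the text
N_LATENT = 8     # hypothetical size of the compressed representation
N_PIXELS = 256   # hypothetical number of wavelength bins


def dense(n_in, n_out):
    """Create one randomly initialised (untrained) dense layer."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(n_out)]
               for _ in range(n_in)]
    bias = [0.0] * n_out
    return weights, bias


def forward(x, layers):
    """Apply dense layers, with ReLU between them and a linear output."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
             for j in range(len(bias))]
        if i < len(layers) - 1:
            x = [max(v, 0.0) for v in x]
    return x


# Autoencoder: spectrum -> latent code -> reconstructed spectrum
encoder = [dense(N_PIXELS, 64), dense(64, N_LATENT)]
decoder = [dense(N_LATENT, 64), dense(64, N_PIXELS)]

# Fully-connected network: stellar parameters -> latent code (assumed design)
fcnn = [dense(N_PARAMS, 32), dense(32, N_LATENT)]


def reconstruct(spectrum):
    """Compress a spectrum and decode it again."""
    return forward(forward(spectrum, encoder), decoder)


def emulate_spectrum(params):
    """Map stellar parameters to a synthetic spectrum via the latent space."""
    return forward(forward(params, fcnn), decoder)


params = [random.uniform(-1.0, 1.0) for _ in range(N_PARAMS)]
spectrum = emulate_spectrum(params)
print(len(spectrum))  # 256
```

The point of such a design is that the expensive mapping from parameters to a full spectrum is reduced to a few matrix multiplications, which is what allows a trained emulator to stand in for a grid of pre-computed atmosphere models.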