cortecs: A Python package for compressing opacities

Arjun B. Savel, Megan Bedell, Eliza M.-R. Kempton
2024-08-06
Abstract: The absorption and emission of light by exoplanet atmospheres encode details of atmospheric composition, temperature, and dynamics. Fundamentally, simulating these processes requires detailed knowledge of the opacity of gases within an atmosphere. When modeling broad wavelength ranges at high resolution, such opacity data, for even a single gas, can take up multiple gigabytes of system random-access memory (RAM). This aspect can be a limiting factor when considering the number of gases to include in a simulation, the sampling strategy used for inference, or even the architecture of the system used for calculations. Here, we present cortecs, a Python tool for compressing opacity data. cortecs provides flexible methods for fitting the temperature, pressure, and wavelength dependencies of opacity data and for evaluating the opacity with accelerated, GPU-friendly methods. The package is actively developed on GitHub (https://github.com/arjunsavel/cortecs), and it is available for download with pip and conda.
Instrumentation and Methods for Astrophysics, Earth and Planetary Astrophysics
What problem does this paper attempt to address?
This paper addresses a memory bottleneck in high-resolution spectral simulations: atmospheric opacity data occupy so much memory (RAM or VRAM) that they restrict how many gases a simulation can include, which sampling strategy can be used for inference, and which computing architecture is practical. Specifically:

- **Memory Occupancy Problem**: When modeling a broad wavelength range at high resolution, the opacity data for even a single gas can occupy several gigabytes of memory. This is especially restrictive for GPU-accelerated computing, because the video random-access memory (VRAM) of GPUs is limited.
- **Computational Efficiency Problem**: Large opacity tables not only occupy memory but also slow computation, especially when many wavelength points must be processed.

To address these problems, the authors developed **cortecs**, a Python package for compressing opacity data. Compression significantly reduces memory occupancy and improves computational efficiency, so that high-resolution spectral simulations can run on a wider range of computing platforms while supporting more gases and more complex models.

### Specific Problem Description

1. **Excessive Memory Occupancy**:
   - High-resolution spectral simulations must process a large number of wavelength points (for example, tens of thousands), which makes opacity data files very large.
   - The opacity data file for a single gas can reach 2.1 GB when stored at float64 precision, and the data volume grows further as temperature, pressure, and wavelength points are added.
2. **Limited Computational Resources**:
   - GPU VRAM is limited, typically to a few tens of gigabytes; even high-end data-center GPUs such as the NVIDIA A100 or H100 offer only on the order of 40 to 80 GB.
   - These hardware limits make it difficult to handle many gases and wavelength points in high-resolution spectral simulations.

### Solution

cortecs compresses opacity data with the following methods:

- **Polynomial Fitting**: Fit polynomial functions to the opacity data, so that only the fit coefficients need to be stored.
- **Principal Component Analysis (PCA)**: Use dimensionality reduction to keep the dominant features of the data and discard redundant information.
- **Neural Network**: Train a neural network to learn, and thereby compress, the opacity data.

These methods can significantly reduce memory occupancy while maintaining relatively high precision, making high-resolution spectral simulations more efficient and feasible.

### Example Application

The paper demonstrates cortecs on a specific case: a parameter retrieval on the high-resolution thermal emission spectrum of the hot Jupiter WASP-77Ab. When the simulation uses compressed opacity data, the resulting posterior distributions and Bayesian evidence are consistent with those obtained from uncompressed data, and the calculation time is similar. This indicates that the compression and decompression schemes of cortecs are accurate enough for at least some high-resolution spectral simulations.

In summary, this paper targets the large memory footprint of opacity data in high-resolution spectral simulations by developing the cortecs package, thereby improving computational efficiency and feasibility.
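The PCA approach listed under the solution methods can be illustrated with a small, self-contained sketch. It uses only NumPy and a synthetic opacity cube. It is not the cortecs API: the function names (`make_toy_opacity`, `pca_compress`, `pca_decompress`), the grid sizes, and the toy opacity model are all assumptions made for illustration. The toy cube is low-rank by construction, so two components reproduce it almost exactly, which real opacity data would not guarantee.

```python
import numpy as np


def make_toy_opacity(n_temp=20, n_pres=15, n_wav=400, seed=0):
    """Return a synthetic log10-opacity cube of shape (n_temp, n_pres, n_wav).

    Hypothetical toy model: smooth in temperature and pressure, rough in
    wavelength. It only stands in for a real opacity table.
    """
    rng = np.random.default_rng(seed)
    log_t = np.log10(np.linspace(500.0, 3000.0, n_temp))[:, None, None]
    log_p = np.linspace(-6.0, 2.0, n_pres)[None, :, None]
    baseline = rng.normal(-22.0, 1.0, size=n_wav)[None, None, :]
    slope = rng.normal(0.0, 1.0, size=n_wav)[None, None, :]
    return baseline + 0.5 * log_p + slope * (log_t - 3.0)


def pca_compress(cube, n_components):
    """Compress the (T, P) dependence at every wavelength with PCA."""
    n_temp, n_pres, n_wav = cube.shape
    # One row per wavelength, one column per (T, P) grid point.
    matrix = cube.reshape(n_temp * n_pres, n_wav).T
    mean = matrix.mean(axis=0)
    centered = matrix - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]          # (k, n_temp * n_pres)
    scores = centered @ components.T        # (n_wav, k)
    return mean, components, scores


def pca_decompress(mean, components, scores, shape):
    """Rebuild the full cube from the stored PCA pieces."""
    matrix = scores @ components + mean     # (n_wav, n_temp * n_pres)
    return matrix.T.reshape(shape)


cube = make_toy_opacity()
mean, components, scores = pca_compress(cube, n_components=2)
restored = pca_decompress(mean, components, scores, cube.shape)

max_error = np.max(np.abs(restored - cube))
stored = mean.size + components.size + scores.size

print(f"max abs error in log10 opacity: {max_error:.2e}")
print(f"values stored: {stored} instead of {cube.size}")
print(f"float64 bytes: {8 * stored} instead of {cube.nbytes}")
```

In this sketch, only the mean (T, P) surface, the principal components, and the per-wavelength scores are kept in memory, and the full table is rebuilt on demand. With real opacity data the number of components becomes a tuning choice that trades reconstruction accuracy against memory.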