Using Galaxy Evolution as Source of Physics-Based Ground Truth for Generative Models

Yun Qi Li, Tuan Do, Evan Jones, Bernie Boscoe, Kevin Alfaro, Zooey Nguyen
2024-07-10
Abstract: Generative models producing images have enormous potential to advance discoveries across scientific fields and require metrics capable of quantifying the high dimensional output. We propose that astrophysics data, such as galaxy images, can test generative models with additional physics-motivated ground truths in addition to human judgment. For example, galaxies in the Universe form and change over billions of years, following physical laws and relationships that are both easy to characterize and difficult to encode in generative models. We build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE) and test their ability to generate realistic galaxies conditioned on their redshifts (galaxy ages). This is one of the first studies to probe these generative models using physically motivated metrics. We find that both models produce comparable realistic galaxies based on human evaluation, but our physics-based metrics are better able to discern the strengths and weaknesses of the generative models. Overall, the DDPM model performs better than the CVAE on the majority of the physics-based metrics. Ultimately, if we can show that generative models can learn the physics of galaxy evolution, they have the potential to unlock new astrophysical discoveries.
Instrumentation and Methods for Astrophysics, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how generative models can be used to produce realistic galaxy images and how their output can be evaluated with physics-based metrics. Specifically, it uses galaxy evolution as a source of physics-based ground truth, testing how well generative models produce galaxy images conditioned on redshift (i.e., galaxy age). The paper points out that existing human-perception-based metrics, such as Inception Score (IS) and Fréchet Inception Distance (FID), can evaluate the quality of generated images well but may overlook scientifically important features, such as the relationship between the galaxy size distribution and redshift. The authors therefore propose new physics-based metrics (galaxy fitting loss, galaxy KL loss, and redshift loss) to evaluate generative models more comprehensively.

### Main contributions of the paper:

1. **Proposing physics-based metrics**: Alongside traditional human-perception-based metrics (such as IS and FID), the authors introduce galaxy fitting loss, galaxy KL loss, and redshift loss to evaluate generative models more accurately.
2. **Constructing generative models**: The authors build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE), and test their ability to generate galaxy images conditioned on redshift.
3. **Evaluating model performance**: Using multiple metrics, the authors evaluate the DDPM and the CVAE in detail and find that the DDPM performs better on most physics-based metrics, especially at high redshift.

### Main methods:

- **Dataset**: 286,401 galaxy images from the Hyper Suprime-Cam survey, covering a redshift range from 0 to 4.
- **Generative models**:
  - **Conditional denoising diffusion probabilistic model (DDPM)**: A conditioning mechanism lets the model generate galaxy images for a given redshift.
  - **Conditional variational autoencoder (CVAE)**: An encoder and a decoder, both conditioned on the redshift of the galaxy.
- **Metrics**:
  - **Galaxy fitting loss**: Measures the irregularity of the generated galaxy.
  - **Galaxy KL loss**: Compares the distributions of physical parameters for generated and real galaxies.
  - **Redshift loss**: Measures the difference between the redshift recovered from the generated image and the redshift the model was conditioned on.

### Results:

- **Visual evaluation**: Images generated by the DDPM are more visually realistic, and their background properties are closer to those of real images. Images generated by the CVAE are prone to extended structures and background correlations at high redshift.
- **Quantitative evaluation**: At low redshift (< 0.5), the CVAE performs slightly better. At high redshift (> 0.5), the DDPM performs significantly better than the CVAE, especially in the distribution of physical properties of the generated galaxies.

### Conclusion:

By introducing physics-based metrics, the authors assess how well generative models produce galaxy images conditioned on redshift, and show that the DDPM outperforms the CVAE on most metrics. These results indicate that generative models can not only produce visually realistic galaxy images but also capture physical characteristics of galaxy evolution, which gives them potential application value in astrophysics.
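
To make the galaxy KL loss listed under the metrics more concrete, here is a minimal sketch of how a KL divergence between the distributions of one physical parameter might be estimated from samples. It is not the authors' implementation: the function name `histogram_kl`, the shared-histogram estimator, the bin count, the smoothing constant `eps`, and the synthetic log-normal "size" samples are all assumptions made for illustration.

```python
import numpy as np


def histogram_kl(real, generated, bins=20, eps=1e-10):
    """Estimate KL(real || generated) for one physical parameter.

    Hypothetical helper: both samples are binned on shared edges,
    smoothed with a small constant so empty bins do not produce
    infinities, and normalized into discrete probabilities.
    """
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)

    # Shared bin edges so both histograms describe the same intervals.
    lo = min(real.min(), generated.min())
    hi = max(real.max(), generated.max())
    edges = np.linspace(lo, hi, bins + 1)

    p, _ = np.histogram(real, bins=edges)
    q, _ = np.histogram(generated, bins=edges)

    # Smooth and normalize the counts.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()

    return float(np.sum(p * np.log(p / q)))


# Synthetic stand-ins for a measured parameter such as galaxy size
# (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
real_sizes = rng.lognormal(mean=0.0, sigma=0.5, size=5000)
good_model = rng.lognormal(mean=0.0, sigma=0.5, size=5000)  # same distribution
bad_model = rng.lognormal(mean=0.5, sigma=0.5, size=5000)   # shifted distribution

kl_good = histogram_kl(real_sizes, good_model)
kl_bad = histogram_kl(real_sizes, bad_model)
print(f"KL vs. matching model: {kl_good:.4f}")
print(f"KL vs. shifted model:  {kl_bad:.4f}")
```

A generator whose samples follow the real distribution should yield a value near zero, and a mismatched one a larger value. In a redshift-conditioned setting, one could compute this per redshift bin to see where a model drifts from the real population.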