Probabilistic Forward Modeling of Galaxy Catalogs with Normalizing Flows

John Franklin Crenshaw, J. Bryce Kalmbach, Alexander Gagliano, Ziang Yan, Andrew J. Connolly, Alex I. Malz, Samuel J. Schmidt, LSST Dark Energy Science Collaboration
2024-05-08
Abstract: Evaluating the accuracy and calibration of the redshift posteriors produced by photometric redshift (photo-z) estimators is vital for enabling precision cosmology and extragalactic astrophysics with modern wide-field photometric surveys. Evaluating photo-z posteriors on a per-galaxy basis is difficult, however, as real galaxies have a true redshift but not a true redshift posterior. We introduce PZFlow, a Python package for the probabilistic forward modeling of galaxy catalogs with normalizing flows. For catalogs simulated with PZFlow, there is a natural notion of "true" redshift posteriors that can be used for photo-z validation. We use PZFlow to simulate a photometric galaxy catalog where each galaxy has a redshift, noisy photometry, shape information, and a true redshift posterior. We also demonstrate the use of an ensemble of normalizing flows for photo-z estimation. We discuss how PZFlow will be used to validate the photo-z estimation pipeline of the Dark Energy Science Collaboration (DESC), and the wider applicability of PZFlow for statistical modeling of any tabular data.
Instrumentation and Methods for Astrophysics, Cosmology and Nongalactic Astrophysics
### What problem does this paper attempt to address?

This paper addresses how to evaluate the accuracy and calibration of the redshift posterior distributions produced by photometric redshift (photo-z) estimators. The authors introduce PZFlow, which uses normalizing flows for probabilistic forward modeling of galaxy catalogs. Every simulated galaxy comes with a "true" redshift posterior, so photo-z estimates can be validated at the level of individual galaxies.

#### Background and Challenges

Modern wide-field photometric surveys (such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time, LSST) produce galaxy catalogs in which the galaxy colors are used to estimate a redshift posterior distribution for each galaxy. Verifying the accuracy and calibration of these posteriors is difficult: a real galaxy has a true redshift, but it does not have a "true" redshift posterior to compare against. Per-galaxy validation therefore cannot rely on real data alone, and a different approach is needed.

#### The Role of PZFlow

PZFlow is a Python package that uses normalizing flows to generate simulated galaxy catalogs in which each galaxy has a true redshift posterior. Researchers can compare an estimated redshift posterior with the "true" one galaxy by galaxy, which gives a more direct evaluation of photo-z estimators. PZFlow can also serve as a photo-z estimator itself, and it is suitable for statistical modeling of any tabular data.

#### Specific Implementation

1. **Constructing the Normalizing Flow Model**: PZFlow learns an invertible mapping between the complex data distribution and a simple latent distribution. This mapping is what makes it possible to generate simulated galaxy catalogs with true redshift posteriors.
2. **Simulating Galaxy Catalogs**: PZFlow can simulate photometric galaxy catalogs that include redshifts, noisy photometry, shape information, and true redshift posteriors.
3. **Photometric Redshift Estimation**: PZFlow can be used not only to generate simulated data but also directly as a density estimator for photo-z estimation.

Together, these capabilities make PZFlow a validation tool for photo-z estimation, which supports the reliability of precision cosmology and galaxy evolution research.

#### Summary

The paper tackles the problem of validating redshift posteriors in photo-z estimation, particularly at the level of individual galaxies, by introducing PZFlow. Catalogs simulated with PZFlow provide true posteriors against which photo-z estimates can be checked.
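
To make the steps under "Specific Implementation" concrete, the sketch below builds a toy two-variable flow by hand using only the Python standard library. It is not PZFlow's API and not the authors' model: the bijection, the parameter values (`A`, `B`, `ALPHA`, `BETA`, `S`), and all function names are hypothetical, chosen only to show the mechanics. An invertible map sends latent Gaussian draws to (redshift, color) pairs, the change-of-variables formula gives the joint density, and evaluating that density on a redshift grid at fixed color and normalizing yields the "true" posterior for each simulated galaxy.

```python
import math
import random

# Hypothetical toy parameters (illustrative only, not taken from the paper).
A, B = -0.5, 0.4                  # mean and scale of log-redshift
ALPHA, BETA, S = 1.5, 0.2, 0.3    # color = ALPHA * z + BETA + S * noise


def forward(u1, u2):
    """Map latent (u1, u2) ~ N(0, I) to data (redshift z, color c)."""
    z = math.exp(A + B * u1)          # keeps redshift positive
    c = ALPHA * z + BETA + S * u2     # color depends on redshift plus noise
    return z, c


def inverse(z, c):
    """Map data (z, c) back to the latent variables."""
    u1 = (math.log(z) - A) / B
    u2 = (c - ALPHA * z - BETA) / S
    return u1, u2


def log_std_normal(u):
    """Log density of a standard normal at u."""
    return -0.5 * (u * u + math.log(2.0 * math.pi))


def log_prob(z, c):
    """Change of variables: log p(z, c) = log p(u) + log |det du/d(z, c)|."""
    u1, u2 = inverse(z, c)
    # The inverse map has a triangular Jacobian with diagonal 1/(B z) and 1/S.
    log_det = -math.log(B * z) - math.log(S)
    return log_std_normal(u1) + log_std_normal(u2) + log_det


def sample(n, seed=0):
    """Draw n simulated (z, c) pairs by pushing latent samples forward."""
    rng = random.Random(seed)
    return [forward(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def posterior(c, grid):
    """p(z | c) on a uniform redshift grid, normalized by the trapezoid rule."""
    joint = [math.exp(log_prob(z, c)) for z in grid]
    dz = grid[1] - grid[0]
    norm = dz * (sum(joint) - 0.5 * (joint[0] + joint[-1]))
    return [p / norm for p in joint]


grid = [0.01 + 0.01 * i for i in range(300)]    # redshifts 0.01 to 3.00
catalog = sample(5)                             # simulated galaxies
posteriors = [posterior(c, grid) for _, c in catalog]
```

Because the catalog is drawn from the same density that defines `posterior`, each grid posterior is exact for this toy model up to grid resolution. That property is what lets a simulated catalog act as ground truth when validating a photo-z estimator.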