The regression for the redshifts of galaxies in SDSS DR18
Wen Xiao-Qing, Yin Hong-Wei, Liu Feng-Hua, Yang Shang-Tao, Zhu Yi-Rong, Yang Jin-Meng, Su Zi-Jie, Guan Bing
DOI: https://doi.org/10.1016/j.cjph.2024.05.045
IF: 3.957
2024-06-17
Chinese Journal of Physics
Highlights:
• Data sets this large have rarely been available for validating the efficiency of algorithms.
• LightGBM, XGBoost, CatBoost, and RF were used to regress the redshifts of galaxies.
• We obtained MAE ∼ 0.018, MSE ∼ 0.001, RMSE ∼ 0.033, MAPE ∼ 0.146, R² ∼ 0.909 for Sample 1.
• We obtained MAE ∼ 0.044, MSE ∼ 0.007, RMSE ∼ 0.083, MAPE ∼ 0.149, R² ∼ 0.895 for Sample 2.
Abstract: We used nine machine learning methods to regress the redshifts of galaxies from SDSS DR18 combined with ALLWISE. Sample 1 contained 862,708 galaxies with observed u, g, r, i, z, J, H, Ks, W1, W2, W3, W4 magnitudes. Sample 2 contained 2,445,923 galaxies with observed u, g, r, i, z, W1, W2, W3, W4 magnitudes. The nine methods were the LightGBM, XGBoost, CatBoost, RF, DF, DT, KNN, GBDT, and SVR algorithms. LightGBM performed best overall, combining the shortest run time with the best results. With LightGBM, we obtained MAE ∼ 0.018, MSE ∼ 0.001, root mean squared error (RMSE) ∼ 0.033, MAPE ∼ 0.146, R² ∼ 0.909, bias ∼ 0.000233, σ ∼ 0.033389, σMAD ∼ 0.0181, outlier fraction ∼ 0.0028, fraction with |Δz| > 2σ ∼ 0.0245, biasnorm ∼ -0.000404, RMSEnorm ∼ 0.025, σNMAD ∼ 0.015774, and run time ∼ 31.021 s for Sample 1. We obtained MAE ∼ 0.044, MSE ∼ 0.007, RMSE ∼ 0.083, MAPE ∼ 0.149, R² ∼ 0.895, bias ∼ 0.000489, σbias ∼ 0.083065, σMAD ∼ 0.0352, outlier fraction ∼ 0.0186, fraction with |Δz| > 2σ ∼ 0.0396, biasnorm ∼ -0.00222, RMSEnorm ∼ 0.053, σNMAD ∼ 0.026093, and run time ∼ 61.847 s for Sample 2. Our results were slightly better than those of other studies, which we attribute to better algorithms or more data. Machine learning algorithms such as LightGBM were comparable to SED fitting up to redshift 1.
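The abstract reports its metrics without defining them. As a rough illustration only, the sketch below computes several of them using definitions that are common in the photometric-redshift literature. Every definition here is an assumption, not taken from the paper: Δz = z_pred − z_spec, normalized residuals Δz / (1 + z_spec), σNMAD = 1.4826 × median(|Δz_norm − median(Δz_norm)|), and an outlier cut of |Δz_norm| > 0.15. The function name `redshift_metrics` and the parameter `outlier_cut` are hypothetical.

```python
import math
import statistics


def redshift_metrics(z_spec, z_pred, outlier_cut=0.15):
    """Compute common redshift-regression metrics for paired sequences.

    All definitions are assumed conventions, not the paper's confirmed ones.
    z_spec values must be non-zero (MAPE divides by them) and must not all
    be equal (R^2 divides by their variance).
    """
    if len(z_spec) != len(z_pred) or len(z_spec) == 0:
        raise ValueError("z_spec and z_pred must be non-empty and equal length")

    n = len(z_spec)
    # Residuals (assumed sign convention: predicted minus spectroscopic).
    dz = [p - s for s, p in zip(z_spec, z_pred)]
    # Normalized residuals, dz / (1 + z_spec).
    dz_norm = [d / (1.0 + s) for d, s in zip(dz, z_spec)]

    mae = sum(abs(d) for d in dz) / n
    mse = sum(d * d for d in dz) / n
    rmse = math.sqrt(mse)
    mape = sum(abs(d) / abs(s) for d, s in zip(dz, z_spec)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_spec = sum(z_spec) / n
    ss_tot = sum((s - mean_spec) ** 2 for s in z_spec)
    r2 = 1.0 - (mse * n) / ss_tot

    bias = sum(dz) / n
    sigma = statistics.pstdev(dz)  # population standard deviation of dz

    # Normalized median absolute deviation of the normalized residuals.
    med = statistics.median(dz_norm)
    sigma_nmad = 1.4826 * statistics.median([abs(d - med) for d in dz_norm])

    # Fraction of objects beyond the assumed outlier cut, and beyond 2 sigma.
    outlier_fraction = sum(abs(d) > outlier_cut for d in dz_norm) / n
    frac_2sigma = sum(abs(d) > 2.0 * sigma for d in dz) / n

    return {
        "mae": mae,
        "mse": mse,
        "rmse": rmse,
        "mape": mape,
        "r2": r2,
        "bias": bias,
        "sigma": sigma,
        "sigma_nmad": sigma_nmad,
        "outlier_fraction": outlier_fraction,
        "frac_2sigma": frac_2sigma,
    }
```

Calling `redshift_metrics(z_spec, z_pred)` on held-out spectroscopic and predicted redshifts returns a dictionary keyed by metric name. Values computed this way match the abstract's figures only if the paper uses the same conventions, which the abstract does not confirm.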
physics, multidisciplinary