Photometric Redshift Estimation Using Gaussian Processes
D. G. Bonfield, Y. Sun, N. Davey, M. J. Jarvis, F. B. Abdalla, M. Banerji, R. G. Adams
DOI: https://doi.org/10.1111/j.1365-2966.2010.16544.x
IF: 4.8
2010-01-01
Monthly Notices of the Royal Astronomical Society
Abstract: We present a comparison between Gaussian processes (GPs) and artificial neural networks (ANNs) as methods for determining photometric redshifts for galaxies, given training-set data. In particular, we compare their degradation in performance as the training-set size is degraded in ways which might be caused by the observational limitations of spectroscopy. Using publicly available regression codes, we find that performance with large, complete training sets is very similar, although the ANN achieves slightly smaller rms errors. Training sets with brighter magnitude limits than the test data do not strongly affect the performance of either algorithm, until the limits are so severe that they remove almost all of the high-redshift training objects. Similarly, the introduction of a plausible number (up to 10 per cent) of inaccurate redshifts into the training set has little effect on either method. However, if the size of the training set is reduced by random sampling, the rms errors of both methods increase, but they do so to a lesser extent and in a much smoother manner for the case of GP regression; for the example presented, ANNZ has rms errors approximately 20 per cent worse than GP regression in the small training-set limit. Also, when training objects are removed at redshifts 1.3 < z < 1.7, to simulate the effects of the 'redshift desert' of optical spectroscopy, the GP regression is successful at interpolating across the redshift gap, while the ANN suffers from strong bias for test objects in this redshift range. Overall, GP regression has attractive properties for photometric redshift estimation, particularly for deep, high-redshift surveys where it is difficult to obtain a large, complete training set. At present, unlike the ANN code, public GP regression codes do not take account of inhomogeneous measurement errors on the photometric data, and thus cannot estimate reliable uncertainties on the predicted redshifts. However, a better treatment of errors is in principle possible, and the promising results in this paper suggest that such improved GP algorithms should be pursued.
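
The 'redshift desert' experiment described in the abstract can be illustrated with a toy sketch. The example below is not the code or data used in the paper: it assumes a single synthetic "colour" feature, a made-up smooth colour-redshift relation (`colour_to_redshift`), a squared-exponential kernel, and fixed, hand-picked hyperparameters (`length_scale`, `amplitude`, `noise`). It removes training objects with 1.3 < z < 1.7 and then asks a minimal NumPy GP regressor to predict redshifts inside that gap.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.3, amplitude=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return amplitude**2 * np.exp(-0.5 * d2 / length_scale**2)


def gp_predict(x_train, z_train, x_test, noise=0.05):
    """Posterior mean of a GP with a constant mean and fixed hyperparameters.

    The single homogeneous `noise` term mirrors the limitation noted in the
    abstract: per-object measurement errors are not modelled here.
    """
    mean = z_train.mean()
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(x_train.size)
    alpha = np.linalg.solve(K, z_train - mean)
    return mean + rbf_kernel(x_test, x_train) @ alpha


def colour_to_redshift(x):
    """Hypothetical smooth, monotonic colour-redshift relation (toy only)."""
    return 3.0 * x + 0.3 * np.sin(2.0 * np.pi * x)


rng = np.random.default_rng(0)

# Synthetic catalogue: one colour per object, redshifts spanning 0 < z < 3.
x_all = rng.uniform(0.0, 1.0, 300)
z_true = colour_to_redshift(x_all)

# Simulate the redshift desert: drop training objects with 1.3 < z < 1.7.
keep = (z_true <= 1.3) | (z_true >= 1.7)
x_train = x_all[keep]
z_train = z_true[keep] + rng.normal(0.0, 0.05, keep.sum())  # toy scatter

# Test objects that all lie inside the removed redshift range.
x_gap = np.linspace(0.37, 0.63, 50)
z_gap_true = colour_to_redshift(x_gap)
z_gap_pred = gp_predict(x_train, z_train, x_gap)

rms_gap = np.sqrt(np.mean((z_gap_pred - z_gap_true) ** 2))
print(f"training objects kept: {x_train.size}")
print(f"rms error inside the gap: {rms_gap:.3f}")
```

In a realistic setting the inputs would be multi-band magnitudes rather than one colour, and the kernel hyperparameters would be optimised against the training set rather than fixed by hand, so this sketch only shows the interpolation behaviour qualitatively.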