Assessing the reliability of ensemble forecasting systems under serial dependence

Jochen Bröcker
DOI: https://doi.org/10.1002/qj.3379
2018-10-01
Quarterly Journal of the Royal Meteorological Society
Abstract: The problem of testing the reliability of ensemble forecasting systems is revisited. A popular tool to assess the reliability of ensemble forecasting systems (for scalar verifications) is the rank histogram; this histogram is expected to be more or less flat, since, for a reliable ensemble, the ranks are uniformly distributed among their possible outcomes. Quantitative tests for flatness (e.g. Pearson's goodness-of-fit test) have been suggested; without exception, however, these tests assume the ranks to be a sequence of independent random variables, which is not the case in general, as can be demonstrated with simple toy examples. In this article, tests are developed that take the temporal correlations between the ranks into account. A refined analysis exploiting the reliability property shows that the ranks still exhibit strong decay of correlations. This property is key to the analysis, and the proposed tests are valid for general ensemble forecasting systems with minimal extraneous assumptions.

Typical rank histograms for an ensemble forecasting system at short and long lead times (top and bottom panels, respectively). Both forecast systems are reliable by construction, but the bottom histogram exhibits stronger variations. This is because the ranks at larger lead times are correlated rather than independent. A new type of goodness-of-fit test is developed which takes these correlations into account. No further extraneous assumptions are needed except that the ranks have to form an ergodic sequence.
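To make the rank histogram and the classical Pearson flatness statistic concrete, here is a minimal Python sketch. It is not the test developed in the paper: it implements only the standard independence-based statistic that the abstract argues is inadequate under serial dependence. The function names, the ensemble size `K`, the sample size `N`, and the Gaussian toy forecasts are all illustrative assumptions, and the toy draws are independent in time, which is exactly the setting where the classical test applies.

```python
import random


def rank_of_verification(ensemble, verification):
    """Rank (1..K+1) of the verification among the K ensemble members."""
    return 1 + sum(1 for member in ensemble if member < verification)


def rank_histogram(ranks, n_bins):
    """Count how often each rank 1..n_bins occurs."""
    counts = [0] * n_bins
    for r in ranks:
        counts[r - 1] += 1
    return counts


def pearson_statistic(counts):
    """Pearson goodness-of-fit statistic against a flat histogram."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Toy setup (assumption): ensemble members and verification are drawn
# from the same distribution, so the ensemble is reliable by construction,
# and successive forecast instances are independent.
rng = random.Random(0)
K = 9      # hypothetical ensemble size
N = 1000   # hypothetical number of forecast instances

ranks = []
for _ in range(N):
    ensemble = [rng.gauss(0.0, 1.0) for _ in range(K)]
    verification = rng.gauss(0.0, 1.0)
    ranks.append(rank_of_verification(ensemble, verification))

counts = rank_histogram(ranks, K + 1)
stat = pearson_statistic(counts)
print("rank counts:", counts)
print("Pearson statistic:", stat)
```

For a reliable ensemble with independent ranks, this statistic is approximately chi-squared distributed with `K` degrees of freedom. When the ranks are serially correlated, as at longer lead times, that reference distribution is no longer justified, which is the gap the paper addresses.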
meteorology & atmospheric sciences