Repeated out of Sample Fusion in the Estimation of Small Tail Probabilities

Benjamin Kedem, Lemeng Pan, Paul Smith, Chen Wang
DOI: https://doi.org/10.48550/arXiv.1803.10766
2019-07-23
Abstract: Often, it is required to estimate the probability that a quantity such as toxicity level, plutonium, temperature, rainfall, damage, wind speed, wave size, earthquake magnitude, risk, etc., exceeds an unsafe high threshold. The probability in question is then very small. To estimate such a probability, information is needed about large values of the quantity of interest. However, in many cases, the data contain only values below, or even far below, the designated threshold, let alone exceedingly large values. It is shown that by repeated fusion of the data with externally generated random data, more information about small tail probabilities is obtained with the aid of certain new statistical functions. This provides relatively short, yet reliable, interval estimates based on moderately large samples. A comparison of the approach with a method from extreme value theory (Peaks over Threshold, or POT), using both artificial and real data, points to the merit of repeated out of sample fusion.
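To make the setting concrete, the following is a minimal sketch of the baseline the abstract compares against (POT), not of the repeated out of sample fusion method itself. It assumes exponential toy data, a 90% quantile as the POT cutoff, and a method-of-moments fit of the generalized Pareto distribution; the function names `empirical_tail` and `pot_tail` and all parameter values are illustrative choices, not taken from the paper.

```python
import math
import random
import statistics


def empirical_tail(data, threshold):
    """Fraction of observations strictly above the threshold."""
    return sum(x > threshold for x in data) / len(data)


def pot_tail(data, threshold, u_quantile=0.9):
    """Peaks-over-Threshold estimate of P(X > threshold).

    Assumes threshold lies above the cutoff u. The generalized Pareto
    parameters are fitted by the method of moments (an assumption made
    here for simplicity; maximum likelihood is the more common choice).
    """
    xs = sorted(data)
    n = len(xs)
    u = xs[int(u_quantile * n)]               # POT cutoff, an order statistic
    excesses = [x - u for x in xs if x > u]   # exceedances over the cutoff
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)
    xi = 0.5 * (1.0 - m * m / v)              # shape
    sigma = 0.5 * m * (m * m / v + 1.0)       # scale
    z = (threshold - u) / sigma
    if abs(xi) < 1e-9:
        survival = math.exp(-z)               # exponential limit as xi -> 0
    else:
        base = 1.0 + xi * z
        survival = base ** (-1.0 / xi) if base > 0 else 0.0
    return len(excesses) / n * survival


random.seed(0)
data = [random.expovariate(1.0) for _ in range(500)]
threshold = max(data) + 1.0                   # above every observed value

print("empirical:", empirical_tail(data, threshold))  # no exceedances in the sample
print("POT:      ", pot_tail(data, threshold))
```

Because the threshold exceeds every observation, the empirical estimate is zero by construction, which is the situation the abstract describes; the POT estimate extrapolates from the largest observations only.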