On a Minimum Distance Procedure for Threshold Selection in Tail Analysis

Holger Drees, Anja Janßen, Sidney I. Resnick, Tiandong Wang
DOI: https://doi.org/10.1137/19m1260463
2020-01-01
SIAM Journal on Mathematics of Data Science
Abstract: Power-law distributions have been widely observed in different areas of scientific research. Practical estimation issues include selecting a threshold above which observations follow a power-law distribution and then estimating the power-law tail index. A minimum distance selection procedure (MDSP) proposed by Clauset, Shalizi, and Newman [SIAM Rev., 51 (2009), pp. 661–703] has been widely adopted in practice for the analyses of social networks. However, theoretical justifications for this selection procedure remain scant. In this paper, we study the asymptotic behavior of the selected threshold and the corresponding power-law index given by the MDSP. For independent and identically distributed (iid) observations with Pareto-like tails, we derive the limiting distribution of the chosen threshold and the power-law index estimator, where the latter estimator is not asymptotically normal. We deduce that in this iid setting the MDSP tends to choose too high a threshold level and show with asymptotic analysis and simulations how the variance increases compared to Hill estimators based on a nonrandom threshold. We also provide simulation results for dependent preferential attachment network data and find that the performance of the MDSP is highly dependent on the chosen model parameters.
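To make the procedure described in the abstract concrete, the following is a minimal sketch of a minimum distance threshold selection, not the authors' implementation. It assumes the common formulation in which, for each candidate number k of upper order statistics, the tail index is estimated by the Hill estimator and the Kolmogorov-Smirnov distance between the empirical distribution of the exceedance ratios and the fitted Pareto distribution is computed; the k with the smallest distance is selected. The function names `hill_estimator` and `mdsp`, the parameter `k_min`, and the normalization by k are illustrative assumptions and may differ in detail from the definitions used in the paper.

```python
import numpy as np


def hill_estimator(sorted_desc, k):
    """Hill estimate of gamma = 1/alpha from the k largest values.

    `sorted_desc` must be sorted in decreasing order and strictly positive;
    the threshold is the (k+1)-th largest value, sorted_desc[k].
    """
    return float(np.mean(np.log(sorted_desc[:k] / sorted_desc[k])))


def mdsp(sample, k_min=2):
    """Sketch of a minimum distance selection of the threshold (assumed form).

    Returns (k, threshold, alpha_hat, ks_distance) for the k that minimizes
    the Kolmogorov-Smirnov distance between the empirical distribution of
    X_(i)/X_(k+1), i = 1..k, and the fitted Pareto cdf 1 - y**(-alpha_hat).
    """
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # decreasing order
    n = x.size
    best = None
    for k in range(k_min, n):
        gamma = hill_estimator(x, k)
        if gamma <= 0.0:  # ties at the threshold give a degenerate fit
            continue
        alpha = 1.0 / gamma
        ratios = x[:k][::-1] / x[k]          # exceedance ratios, increasing
        fitted = 1.0 - ratios ** (-alpha)    # fitted Pareto cdf at each ratio
        upper = np.arange(1, k + 1) / k      # empirical cdf just after a jump
        lower = np.arange(0, k) / k          # empirical cdf just before a jump
        dist = max(np.max(upper - fitted), np.max(fitted - lower))
        if best is None or dist < best[3]:
            best = (k, float(x[k]), alpha, float(dist))
    if best is None:
        raise ValueError("no admissible threshold found")
    return best


if __name__ == "__main__":
    # Illustration on simulated exact Pareto data with alpha = 2 (an assumption
    # made only for this demo; the paper studies Pareto-like tails in general).
    rng = np.random.default_rng(0)
    data = (1.0 - rng.random(1000)) ** (-1.0 / 2.0)
    k, u, alpha_hat, d = mdsp(data)
    print(f"selected k={k}, threshold={u:.3f}, alpha_hat={alpha_hat:.3f}, KS={d:.4f}")
```

The loop is quadratic in the sample size, which keeps the sketch short but is only practical for moderate n. Comparing the selected k with the sample size in repeated simulations of this kind is one way to see the behavior the abstract describes, namely that the selected threshold tends to be high and the resulting index estimate more variable than a Hill estimate at a fixed k.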