Flexible Heavy Tailed Distributions for Big Data

Yuanyuan Zhang, Saralees Nadarajah
DOI: https://doi.org/10.1007/s40745-017-0113-4
2017-01-01
Annals of Data Science
Abstract: The Pareto type I distribution (also known as the power law distribution and Zipf’s law) appears to be the main distribution used to model heavy tailed phenomena in the big data literature. The Pareto type I distribution, being one of the oldest heavy tailed distributions, is not very flexible. Here, we show the flexibility of four other heavy tailed distributions for modeling four big data sets in social networks. The Pareto type I distribution is shown not to provide the best or even an adequate fit for any of the data sets.