Beta Rank Function: A Smooth Double-Pareto-Like Distribution

Oscar Fontanelli, Pedro Miramontes, Ricardo Mansilla, Germinal Cocho, Wentian Li
DOI: https://doi.org/10.48550/arXiv.1910.05364
2019-10-12
Abstract: The Beta Rank Function (BRF) $x(u) = A(1-u)^b/u^a$, where $u$ is the normalized and continuous rank of an observation $x$, has wide applications in fitting real-world data from social science to biological phenomena. The underlying probability density function (pdf) $f_X(x)$ does not usually have a closed expression except for specific parameter values. We show however that it is approximately a unimodal skewed and asymmetric two-sided power law/double Pareto/log-Laplacian distribution. The BRF pdf has simple properties when the independent variable is log-transformed: $f_{Z=\log(X)}(z)$. At the peak it makes a smooth turn and it does not diverge, lacking the sharp angle observed in the double Pareto or Laplace distribution. The peak position of $f_Z(z)$ is $z_0=\log A+(a-b)\log(\sqrt{a}+\sqrt{b})-(a\log(a)-b\log(b))/2$; the probability is partitioned by the peak to the proportion of $\sqrt{b}/(\sqrt{a}+\sqrt{b})$ (left) and $\sqrt{a}/(\sqrt{a}+\sqrt{b})$ (right); the functional form near the peak is controlled by the cubic term in the Taylor expansion when $a\ne b$; the mean of $Z$ is $E[Z]=\log A+a-b$; the decay on left and right sides of the peak is approximately exponential with forms $e^{\frac{z-\log A}{b}}/b$ and $e^{-\frac{z-\log A}{a}}/a$. These results are confirmed by numerical simulations. Properties of $f_X(x)$ without log-transforming the variable are much more complex, though the approximate double Pareto behavior, $(x/A)^{1/b}/(bx)$ (for $x<A$) and $(x/A)^{-1/a}/(ax)$ (for $x>A$), is simple. Our results elucidate the relationship between BRF and log-normal distributions when $a=b$ and explain why the BRF is ubiquitous and versatile. Based on the pdf, we suggest a quick way to elucidate if a real data set follows a one-sided power-law, a log-normal, a two-sided power-law or a BRF. We illustrate our results with two examples: urban populations and financial returns.
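The closed-form statements about $Z=\log X$ can be checked with a short simulation. The sketch below is not the authors' code; it assumes that the normalized rank $u$ is uniform on $(0,1)$, so that a sample of $Z$ is obtained as $\log A + b\log(1-u) - a\log u$. The function names (`brf_log_sample`, `peak_log_position`, `mean_log`, `right_mass`) and the parameter values in the usage lines are hypothetical, chosen only for illustration.

```python
import numpy as np


def brf_log_sample(A, a, b, n, seed=0):
    """Draw n samples of Z = log X under the BRF x(u) = A (1-u)^b / u^a.

    Assumption: the normalized rank u is uniform on (0, 1).
    """
    rng = np.random.default_rng(seed)
    u = rng.random(n)  # values in [0, 1)
    # Guard against u == 0 so that log(u) stays finite.
    u = np.maximum(u, np.finfo(float).tiny)
    return np.log(A) + b * np.log1p(-u) - a * np.log(u)


def peak_log_position(A, a, b):
    """Peak position z_0 of f_Z(z) as stated in the abstract."""
    s = np.sqrt(a) + np.sqrt(b)
    return np.log(A) + (a - b) * np.log(s) - (a * np.log(a) - b * np.log(b)) / 2


def mean_log(A, a, b):
    """Mean of Z as stated in the abstract: E[Z] = log A + a - b."""
    return np.log(A) + a - b


def right_mass(a, b):
    """Probability mass to the right of the peak: sqrt(a) / (sqrt(a) + sqrt(b))."""
    return np.sqrt(a) / (np.sqrt(a) + np.sqrt(b))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    A, a, b = 2.0, 0.8, 0.3
    z = brf_log_sample(A, a, b, n=200_000)
    z0 = peak_log_position(A, a, b)
    print("sample mean of Z      :", z.mean())
    print("log A + a - b         :", mean_log(A, a, b))
    print("sample P(Z > z_0)     :", (z > z0).mean())
    print("sqrt(a)/(sqrt a+sqrt b):", right_mass(a, b))
```

Because $z(u)$ is strictly decreasing in $u$, the event $Z>z_0$ corresponds to $u<u_0$ with $u_0=\sqrt{a}/(\sqrt{a}+\sqrt{b})$, which is why the right-hand mass equals $u_0$ under the uniform-rank assumption.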