Tail universalities in rank distributions as an algebraic problem: the beta-like function

Gerardo G. Naumis, Germinal Cocho
DOI: https://doi.org/10.1016/j.physa.2007.08.002
2007-05-04
Abstract: Although power laws of the Zipf type have been used by many workers to fit rank distributions in different fields like in economy, geophysics, genetics, soft-matter, networks etc., these fits usually fail at the tails. Some distributions have been proposed to solve the problem, but unfortunately they do not fit at the same time both ending tails. We show that many different data in rank laws, like in granular materials, codons, author impact in scientific journal, etc. are very well fitted by a beta-like function. Then we propose that such universality is due to the fact that a system made from many subsystems or choices, imply stretched exponential frequency-rank functions which qualitatively and quantitatively can be fitted with the proposed beta-like function distribution in the limit of many random variables. We prove this by transforming the problem into an algebraic one: finding the rank of successive products of a given set of numbers.
Data Analysis, Statistics and Probability; General Physics
What problem does this paper attempt to address?
The paper addresses the poor fit of power laws (such as Zipf's law) at the tails of rank distributions. Although power laws are widely used to fit rank distributions in fields such as economics, geophysics, genetics, soft matter, and networks, these fits usually fail at the tails. Other distributions have been proposed to solve this problem, but they cannot fit both tails at once. The authors find that many different data sets, such as granular materials, codons, and impact factors of scientific journals, are fitted very well by a beta-like function. They therefore propose that this universality arises because a system composed of many subsystems or choices leads to stretched-exponential frequency-rank functions, which the proposed beta-like function fits both qualitatively and quantitatively in the limit of many random variables.

Specifically, the core problems of the paper are:

1. **Poor fit of the power law at the tails**: When the traditional power-law model is used to describe rank distributions, it performs poorly at the tails, that is, at the two ends of the ranking.
2. **Finding a more suitable model**: The paper proposes a new model, a beta-like function, that fits the entire rank distribution better, especially at the tails.
3. **Explaining the universality of the model**: By transforming the question into an algebraic problem, the paper explores why the beta-like function fits data from many different fields so well.

### Main contributions

- **Propose a beta-like function model**: The form is \( f(r)=K\left(R - r + 1\right)^b r^{-a}\), where \( a\) and \( b\) are parameters fitted from the data, \( K\) is a constant prefactor, \( r\) is the rank, and \( R\) is the maximum rank.
- **Explain the mathematical basis of the model**: By transforming the question into an algebraic problem, namely finding the rank of successive products of a given set of numbers, the authors connect the beta-like function to the limit of many random variables, in the spirit of a central limit theorem.
- **Wide application**: The paper demonstrates the model in multiple fields, such as geography, impact factors of scientific journals, genomic codon usage frequencies, and physical phenomena (such as slip events in granular materials), and reports very good fits.

### Conclusion

The paper proposes a new beta-like function model that fits rank distributions in different fields better than the traditional power-law model, especially at the tails. The universality of this model may be related to its mathematical basis: a system composed of many subsystems or choices leads to stretched-exponential frequency-rank functions. This finding provides a new perspective for understanding rank distributions in complex systems.
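
As an illustration of the two ideas summarized above, the following Python sketch evaluates the beta-like function \( f(r)=K\left(R - r + 1\right)^b r^{-a}\), builds a toy version of the algebraic problem (ranking all products of `n` factors drawn from a fixed set of numbers), and fits \( K\), \( a\), and \( b\) by ordinary least squares in log space. The function names, the factor set `[0.2, 0.3, 0.5]`, and the log-space fitting method are assumptions chosen for this example; the paper's own construction and fitting procedure may differ.

```python
import math
from itertools import product


def beta_like(r, K, a, b, R):
    """Beta-like rank function f(r) = K * (R - r + 1)**b * r**(-a)."""
    return K * (R - r + 1) ** b * r ** (-a)


def ranked_products(factors, n):
    """All products of n factors drawn (with repetition) from `factors`,
    sorted from largest to smallest, so index i holds rank i + 1."""
    values = [math.prod(combo) for combo in product(factors, repeat=n)]
    return sorted(values, reverse=True)


def solve_3x3(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, 3):
            factor = M[i][col] / M[col][col]
            for j in range(col, 4):
                M[i][j] -= factor * M[col][j]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (M[i][3] - s) / M[i][i]
    return x


def fit_beta_like(values):
    """Least-squares fit of log f(r) = log K + b*log(R - r + 1) - a*log(r).

    `values` must be positive and sorted in descending order (rank 1 first).
    Returns the tuple (K, a, b).
    """
    R = len(values)
    rows = [(1.0, math.log(R - r + 1), -math.log(r)) for r in range(1, R + 1)]
    targets = [math.log(v) for v in values]
    # Normal equations (X^T X) p = X^T y for p = (log K, b, a).
    A = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
         for i in range(3)]
    y = [sum(row[i] * t for row, t in zip(rows, targets)) for i in range(3)]
    log_K, b, a = solve_3x3(A, y)
    return math.exp(log_K), a, b


if __name__ == "__main__":
    # Toy instance: 3**6 = 729 ranked products of an assumed factor set.
    values = ranked_products([0.2, 0.3, 0.5], n=6)
    K, a, b = fit_beta_like(values)
    print(f"K={K:.4g}, a={a:.4g}, b={b:.4g}")
```

Fitting in log space turns the model into a linear regression on \( \log r\) and \( \log(R - r + 1)\), which keeps the sketch dependency-free. It weights relative rather than absolute errors, so a nonlinear fit on the raw frequencies could give somewhat different parameters.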