Fractal Power Law in Literary English

L.L. Goncalves, L.B. Goncalves
DOI: https://doi.org/10.1016/j.physa.2005.06.049
2005-06-04
Abstract: We present in this paper a numerical investigation of literary texts by various well-known English writers, covering the first half of the twentieth century, based upon the results obtained through corpus analysis of the texts. A fractal power law is obtained for the lexical wealth defined as the ratio between the number of different words and the total number of words of a given text. By considering as a signature of each author the exponent and the amplitude of the power law, and the standard deviation of the lexical wealth, it is possible to discriminate works of different genres and writers and show that each writer has a very distinct signature, either considered among other literary writers or compared with writers of non-literary texts. It is also shown that, for a given author, the signature is able to discriminate between short stories and novels.
Other Condensed Matter, Physics and Society
What problem does this paper attempt to address?
The paper asks whether lexical wealth in literary English texts follows a fractal power law, and whether that law can be used to tell apart the works of different authors and different literary genres. It focuses on four aspects:

1. **Verifying the fractal power law**: By analyzing English literary works from the first half of the 20th century, the authors test whether the lexical wealth of these texts follows a fractal power law.
2. **Author's signature**: The exponent \(\phi\) and amplitude \(A\) of the power law, together with the standard deviation \(\sigma\) of the lexical wealth, are treated as a "signature" of each author that can distinguish the works of different writers.
3. **Distinguishing between literary and non-literary texts**: The same parameters are used to separate literary texts (such as short stories and novels) from non-literary texts (such as news reports).
4. **Distinguishing different literary genres**: For works of different genres by the same author (for example, short stories and novels), the authors examine whether changes in these parameters identify the genre.

### Specific problem summary

- **Does the fractal power law exist?** The authors assume that the relationship between lexical wealth and text length in literary texts can be described by a fractal power law:

  \[ N = A k^{-\phi} \]

  where \(N\) is the total number of words in the text, \(k\) is the lexical wealth (the ratio of the number of different words to the total number of words), \(A\) is the amplitude, and \(\phi\) is the power-law exponent.
- **Author's signature**: Can the power-law exponent \(\phi\), the amplitude \(A\), and the standard deviation \(\sigma\) of the lexical wealth serve as a unique "signature" that distinguishes one writer from another?
- **Distinguishing between literary and non-literary texts**: Does comparing the parameters of literary and non-literary texts yield a way to separate the two types of text?
- **Distinguishing different literary genres**: Do these parameters change across works of different genres by the same author, and can that change identify the genre?

By addressing these questions, the paper aims to provide a method for literary analysis based on statistics and fractal theory, and to make literary research more objective and quantitative.
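The quantities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the tokenizer, the sampling interval `step`, the choice to measure lexical wealth on growing prefixes of a text, and the definition of \(\sigma\) as the standard deviation of those sampled values are all assumptions, and the function names are hypothetical. The fit uses ordinary least squares on \(\log N = \log A - \phi \log k\), which is one common way to estimate power-law parameters.

```python
import math
import re
import statistics


def lexical_wealth_curve(text, step=50):
    """Sample (N, k) every `step` words: N words read so far, k = distinct / N.

    Assumption: words are lowercase runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    seen = set()
    curve = []
    for n, word in enumerate(words, start=1):
        seen.add(word)
        if n % step == 0:
            curve.append((n, len(seen) / n))
    return curve


def fit_power_law(curve):
    """Least-squares fit of log N = log A - phi * log k; returns (A, phi).

    Needs at least two samples with different k values.
    """
    xs = [math.log(k) for _, k in curve]
    ys = [math.log(n) for n, _ in curve]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    amplitude = math.exp(mean_y - slope * mean_x)
    return amplitude, -slope


def signature(text, step=50):
    """Return the (phi, A, sigma) triple for one text.

    Assumption: sigma is the population standard deviation of the sampled k.
    """
    curve = lexical_wealth_curve(text, step)
    amplitude, phi = fit_power_law(curve)
    sigma = statistics.pstdev([k for _, k in curve])
    return phi, amplitude, sigma
```

Calling `signature(text)` on the full text of a novel or short story yields one triple per work. Comparing triples across works is the kind of discrimination the summary describes, though the values obtained depend on the tokenization and sampling choices made here.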