The Structural Complexity of Chinese Words and Its Relationship with Word Frequency
Xinpei Hong, Wei Huang, Haitao Liu
DOI: https://doi.org/10.1080/09296174.2023.2231743
2023-01-01
Journal of Quantitative Linguistics
Abstract: The morphological synergetic model has yet to be fully tested in typical analytic languages. Quantifying Chinese morphology and its relationship with word frequency can help construct and test the morphological synergetic model in Chinese. Based on the Lancaster Corpus of Mandarin Chinese, this study proposes a method for quantifying the structural complexity of Chinese words using Kolmogorov complexity, and then examines the interrelation between the structural complexity of words (SCW) and word frequency. Results show that, among the seven structural types of Chinese words, the SCW of words formed by combining morphemes in multiple assembling ways is generally higher than that of words formed in a single assembling way, but derivational affixes impact SCW significantly. The higher the SCW, the lower the word frequency. Given the combined effects of morpheme properties, $y = Ax^{-b}e^{-cx}$ is more suitable than $y = Ax^{-b}$ for describing this inverse relationship. Conversely, the higher the word frequency, the lower the SCW. Delayed negative feedback causes small-scale fluctuations, but $y = Ax^{-b}e^{-cx}$ can effectively describe the overall interaction between the two. In terms of the internal mechanism, word frequency changes first, thus causing changes in word structure; in turn, for communication effectiveness, the structure of words becomes more complex to carry more meaning, thus influencing word frequency.
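The comparison between the two candidate functions can be illustrated with a minimal sketch. It assumes a log-space least-squares fit, since taking logarithms turns $y = Ax^{-b}e^{-cx}$ into the linear form $\ln y = \ln A - b \ln x - cx$; the abstract does not state which fitting procedure the authors used. The data are synthetic values generated for illustration, not figures from the Lancaster Corpus of Mandarin Chinese, and the function names `solve_linear` and `fit_log_linear` are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_log_linear(xs, ys, with_exponential):
    """Fit ln y = ln A - b ln x (- c x) by ordinary least squares.

    Returns (A, b, c, sse), where sse is the sum of squared residuals
    in log space and c is 0.0 for the plain power law.
    """
    rows = [[1.0, -math.log(x)] + ([-x] if with_exponential else []) for x in xs]
    target = [math.log(y) for y in ys]
    p = len(rows[0])
    # Normal equations: (X^T X) coef = X^T t
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xtt = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(p)]
    coef = solve_linear(xtx, xtt)
    sse = sum(
        (t - sum(c * v for c, v in zip(coef, r))) ** 2
        for r, t in zip(rows, target)
    )
    c = coef[2] if with_exponential else 0.0
    return math.exp(coef[0]), coef[1], c, sse


# Synthetic data (an assumption for illustration only): generated from the
# extended model with A = 500, b = 0.8, c = 0.15.
xs = [float(i) for i in range(1, 21)]
ys = [500.0 * x ** -0.8 * math.exp(-0.15 * x) for x in xs]

power = fit_log_linear(xs, ys, with_exponential=False)
extended = fit_log_linear(xs, ys, with_exponential=True)

print("power law      A=%.3f b=%.3f sse=%.6f" % (power[0], power[1], power[3]))
print("with e^(-cx)   A=%.3f b=%.3f c=%.3f sse=%.6f" % extended)
```

Because the power law is the special case $c = 0$, the extended model never fits worse in log space; on real data an information criterion or adjusted goodness-of-fit measure would be needed to judge whether the extra parameter is warranted.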