Characterizing Ranked Chinese Syllable-to-Character Mapping Spectrum: A Bridge between the Spoken and Written Chinese Language

Wentian Li
DOI: https://doi.org/10.1080/09296174.2013.773140
2013-05-01
Journal of Quantitative Linguistics
Abstract: One important aspect of the relationship between spoken and written Chinese is the ranked syllable-to-character mapping spectrum: the list of syllables ranked by the number of characters that map to each syllable. Previously, this spectrum was analysed for more than 400 syllables without distinguishing the four tones. In the current study, the spectrum of 1280 toned syllables is fitted with a logarithmic function, a Beta rank function, and a piecewise logarithmic function. Of the three fitting functions, the two-piece logarithmic function fits the data best, giving both the smallest sum of squared errors (SSE) and the lowest Akaike information criterion (AIC) value. The Beta rank function is a close second. By sampling from a Poisson distribution whose parameter value is chosen from the observed data, we empirically estimate the chance probability that the Beta rank function could outperform the two-piece logarithmic function to be 0.12-0.14. For practical purposes, the piecewise logarithmic function and the Beta rank function can be considered a tie.
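The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's code or data: the rank-count values below are synthetic, the function names (`fit_log`, `fit_two_piece_log`, `aic`) are hypothetical, the two pieces are fitted independently by scanning breakpoints, and the AIC is computed with the common least-squares form n·ln(SSE/n) + 2k, which may differ from the variant used in the paper. The parameter counts (2 for the single logarithmic function, 5 for the two-piece version including the breakpoint) are likewise an assumption. The Beta rank function is omitted for brevity.

```python
import math

def fit_log(ranks, counts):
    """Ordinary least-squares fit of counts = a + b*ln(rank). Returns (a, b, sse)."""
    xs = [math.log(r) for r in ranks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, counts))
    return a, b, sse

def fit_two_piece_log(ranks, counts, min_pts=3):
    """Try every breakpoint, fit a separate logarithmic function on each side,
    and return (best_split_index, total_sse). Assumption: the pieces are
    fitted independently, with no continuity constraint at the breakpoint."""
    best = None
    for k in range(min_pts, len(ranks) - min_pts + 1):
        sse = fit_log(ranks[:k], counts[:k])[2] + fit_log(ranks[k:], counts[k:])[2]
        if best is None or sse < best[1]:
            best = (k, sse)
    return best

def aic(sse, n, k):
    """Least-squares AIC (Gaussian errors), up to an additive constant."""
    return n * math.log(sse / n) + 2 * k

# Synthetic spectrum (NOT the paper's data): slope changes at rank 10,
# plus a small deterministic perturbation so that the SSE is nonzero.
ranks = list(range(1, 41))
counts = []
for r in ranks:
    if r <= 10:
        y = 60.0 - 10.0 * math.log(r)
    else:
        y = 60.0 - 10.0 * math.log(10) - 20.0 * (math.log(r) - math.log(10))
    counts.append(y + 0.3 * math.sin(r))

n = len(ranks)
_, _, sse_one = fit_log(ranks, counts)
split, sse_two = fit_two_piece_log(ranks, counts)

aic_one = aic(sse_one, n, k=2)  # intercept and slope
aic_two = aic(sse_two, n, k=5)  # two intercepts, two slopes, one breakpoint (assumed count)

print("single log:    SSE=%.3f AIC=%.2f" % (sse_one, aic_one))
print("two-piece log: SSE=%.3f AIC=%.2f (split after rank %d)" % (sse_two, aic_two, ranks[split - 1]))
```

Because the two-piece model has more free parameters, its SSE can never exceed that of the single logarithmic function under this fitting scheme; the AIC penalty term 2k is what makes the comparison informative.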
linguistics