Character Usage in Chinese Short Message Service (SMS): a Real-World Study in Mainland China

Xi Chen, Chenhui Guo, Michael Chau, Weihua Zhou
DOI: https://doi.org/10.1504/ijmc.2013.056954
2013-01-01
International Journal of Mobile Communications
Abstract: Short message service (SMS) is an important component of modern mobile services. Given the unique characteristics of the Chinese language, it is imperative to study language usage patterns in Chinese SMS so that important facts, such as why and how people in China use SMS, can be discovered. In this paper, we report an analysis of Chinese SMS logs from three different provinces in China. A computational approach was applied to extract n-grams from the SMS logs. The language usage patterns reported in this paper cover two aspects: (1) the most popular n-grams, which represent what types of information were transmitted via SMS; and (2) the distribution of n-grams in comparison with Zipf's law. We discovered that, compared with other forms of free text in Chinese, SMS contains more conversational elements, which are expressed mostly in bigrams. Trigrams, 4-grams, and 5-grams are less frequent but are closely connected to commercial activities, which may indicate the commercial needs of SMS users.
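The abstract does not describe the extraction procedure in detail, so the following is only a minimal sketch of one way the two analyses could be approached. It assumes character-level n-grams taken with a sliding window over each message, and it estimates how closely the counts follow Zipf's law by fitting a least-squares line to log frequency against log rank (a slope near -1 would be consistent with Zipf's law). The function names and the sample messages are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def char_ngrams(text, n):
    """Return all contiguous character n-grams of a string (sliding window)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def count_ngrams(messages, n):
    """Count character n-grams across a collection of messages."""
    counts = Counter()
    for msg in messages:
        counts.update(char_ngrams(msg, n))
    return counts


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Assumption: this simple log-log fit stands in for whatever
    comparison with Zipf's law the paper actually performed.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical sample messages, for illustration only.
messages = ["你好吗", "你好", "好的谢谢"]
bigram_counts = count_ngrams(messages, 2)
top_bigrams = bigram_counts.most_common(3)
```

Calling `count_ngrams(messages, n)` with `n` from 2 to 5 would give the bigram through 5-gram counts the abstract refers to, and `most_common` lists the most popular ones. A real SMS log would likely need cleaning first (punctuation, digits, mixed-language text), which this sketch does not attempt.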