Chinese Word Co-occurrence Network: Its Small World Effect and Scale-free Property

Zhi-yuan LIU, Mao-song SUN
Abstract: Some aspects of human languages can be characterized by complex network analysis. In this paper, word co-occurrence networks for the Chinese language are first constructed automatically from very large, manually word-segmented Chinese corpora of different sizes and styles. These networks are then examined systematically from a complex-network point of view. Experimental results show that they display two important features of complex networks: (1) the average distance between two words is 2.63-2.75, and the clustering coefficient is much greater than that of a random network with the same parameters, which is a typical small-world effect; and (2) the degree distributions of these networks generally obey a power law, i.e., the networks are scale-free. In addition, a quantitative analysis is conducted of the kernel lexicons derived from these experiments.
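The three quantities named in the abstract (average distance, clustering coefficient, degree distribution) can be illustrated on a toy corpus. The following is a minimal sketch, not the authors' procedure. It assumes that two words are linked when they appear next to each other in a segmented sentence (the `max_gap` parameter), which the abstract does not specify. All function names are hypothetical, and the two-sentence corpus is invented for illustration only.

```python
from collections import Counter, defaultdict, deque


def build_cooccurrence_network(sentences, max_gap=1):
    """Build an undirected graph linking words at most `max_gap` positions apart.

    `sentences` is a list of already segmented sentences (lists of words).
    max_gap=1 links adjacent words only; this is an assumption of the sketch.
    """
    adj = defaultdict(set)
    for words in sentences:
        for i, w in enumerate(words):
            for j in range(i + 1, min(i + 1 + max_gap, len(words))):
                v = words[j]
                if v != w:  # ignore self-loops from repeated words
                    adj[w].add(v)
                    adj[v].add(w)
    return adj


def average_distance(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # nodes with fewer than two neighbours count as 0
            continue
        nb = list(nbrs)
        links = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if nb[b] in adj[nb[a]]
        )
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0


def degree_distribution(adj):
    """Map each degree value to the number of words having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Toy corpus (invented): two segmented sentences forming a triangle of words.
toy = [["我", "学习", "语言"], ["我", "语言"]]
net = build_cooccurrence_network(toy)
print(average_distance(net))        # 1.0
print(clustering_coefficient(net))  # 1.0
print(dict(degree_distribution(net)))
```

On a real corpus, a small-world network would show a short average distance together with a clustering coefficient far above that of a random graph with the same number of nodes and edges, and a scale-free network would show a degree distribution that is roughly linear on a log-log plot. The all-pairs BFS here is quadratic in network size, so a sketch like this only suits small samples.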