Initial Comparison of Linguistic Networks Measures for Parallel Texts

Kristina Ban, Ana Meštrović, Sanda Martinčić-Ipšić
DOI: https://doi.org/10.48550/arXiv.1405.1893
2014-07-18
Abstract: This paper presents preliminary results of an analysis of Croatian syllable networks. A syllable network is a network in which nodes are syllables and links between them are constructed according to their connections within words. We analyze syllable networks generated from texts collected from the Croatian Wikipedia and from blogs. Our main tool is complex network analysis, whose methods can reveal new patterns in language structure. We aim to show that syllable networks have a much higher clustering coefficient than Erdős-Rényi random networks. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. Furthermore, we compare Croatian syllable networks with Portuguese and Chinese syllable networks and show that they have similar properties.
Computation and Language, Social and Information Networks, Physics and Society
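
The construction described in the abstract (syllables as nodes, links between syllables connected within a word) and the comparison against an Erdős-Rényi baseline can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that words arrive already split into syllables, that a link joins syllables adjacent within a word, and that links are undirected and unweighted; the abstract does not confirm any of these choices. The function names and the toy input are hypothetical.

```python
from itertools import combinations


def build_syllable_network(words):
    """Build an undirected network from pre-syllabified words.

    `words` is an iterable of syllable lists. Assumption: a link joins
    two syllables that are adjacent within the same word.
    """
    adj = {}
    for syllables in words:
        for s in syllables:
            adj.setdefault(s, set())
        for a, b in zip(syllables, syllables[1:]):
            if a != b:  # skip self-loops
                adj[a].add(b)
                adj[b].add(a)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count links among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def er_expected_clustering(adj):
    """Expected clustering of an Erdős-Rényi graph with the same number
    of nodes and links, which equals the link probability p."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2.0 * m / (n * (n - 1)) if n > 1 else 0.0


# Toy input with placeholder syllables (not real Croatian data).
toy_words = [["a", "b"], ["b", "c"], ["c", "a"], ["c", "d"]]
network = build_syllable_network(toy_words)
c_observed = average_clustering(network)
c_random = er_expected_clustering(network)
```

On a network this small the two values are not meaningfully comparable; the gap the abstract refers to would only be expected on networks built from a large corpus.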