Using A Chinese Treebank to Measure Dependency Distance

Haitao Liu, Richard Hudson, Zhiwei Feng
DOI: https://doi.org/10.1515/cllt.2009.007
2009-01-01
Corpus Linguistics and Linguistic Theory
Abstract: This article describes a method for calculating the 'dependency distance' between the words in a text (i.e. the number of words that separate each word from the word on which it depends syntactically) and reports the results of applying this method to a Chinese treebank. The study shows that Chinese dependencies tend strongly to be governor-final and that the mean dependency distance of words is much higher for Chinese than for the other languages that have been studied, including English, German and Japanese. It is unclear whether this difference means that Chinese is syntactically more difficult to process.
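To make the measure concrete, here is a minimal sketch of how dependency distance might be computed for one sentence. It is an illustration under stated assumptions, not the authors' implementation. It assumes each word is annotated with the 1-based position of its governor (0 for the root), that distance is the difference between the governor's and the dependent's positions (so adjacent words have distance 1), and that a positive value means the governor follows the dependent (governor-final). The function names and the toy annotation are hypothetical.

```python
from typing import List


def dependency_distances(heads: List[int]) -> List[int]:
    """Return signed dependency distances for one sentence.

    heads[i] is the 1-based position of the governor of word i+1;
    0 marks the root, which has no governor and is skipped.
    Assumed convention: distance = governor position - dependent position,
    so a positive value means the governor comes after the dependent.
    """
    distances = []
    for position, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root depends on nothing
        distances.append(head - position)
    return distances


def mean_dependency_distance(heads: List[int]) -> float:
    """Mean of the absolute dependency distances (0.0 if there are none)."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(abs(d) for d in distances) / len(distances)


def governor_final_share(heads: List[int]) -> float:
    """Fraction of dependencies whose governor follows the dependent."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(1 for d in distances if d > 0) / len(distances)


# Hypothetical annotation of "The student read a book":
# The -> student, student -> read, read = root, a -> book, book -> read
toy_heads = [2, 3, 0, 5, 3]
print(dependency_distances(toy_heads))      # [1, 1, 1, -2]
print(mean_dependency_distance(toy_heads))  # 1.25
print(governor_final_share(toy_heads))      # 0.75
```

Averaging these per-sentence values over all dependencies in a treebank would give a corpus-level figure comparable across languages, provided the treebanks follow similar annotation conventions.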