Entropy in Different Text Types.

Ruina Chen, Haitao Liu, Gabriel Altmann
DOI: https://doi.org/10.1093/llc/fqw008
IF: 1.299
2016-01-01
Digital Scholarship in the Humanities
Abstract: The present study investigates how the unique linguistic profile of different text types is reflected in their respective entropy characteristics. With samples from the Lancaster Corpus of Mandarin Chinese and the Freiburg-Brown corpus of American English, the research examines entropy in two dimensions: the relative entropy of words and their part-of-speech (POS) forms at different sentential positions, and the entropy of aspect markers. Our research yields the following results. First, Chinese and English show a strikingly similar distribution pattern in the relative entropy of word-forms and POS-forms at different sentential positions. The relative entropy of word-forms in descending order yields: news > essays > official > academic > fiction, and that of POS-forms yields: fiction > essays > news > academic > official. The relative entropy of POS-forms may be a more reliable indicator of syntactic differences, which helps to distinguish the dichotomous 'narrative vs. expository' text types in both Chinese and English. Second, there is a cross-linguistic difference in the entropy of aspect markers: Chinese displays higher relative entropy than English. This indicates that aspect marking, in terms of variation, is more prominent in Chinese grammar than in English. The 'narrative vs. expository' distinction is also identified by the entropy of aspect markers in both Chinese and English, though more clearly in Chinese.
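The abstract does not state the formula behind "relative entropy". As a rough illustration only, the Python sketch below assumes it means Shannon entropy divided by its maximum, log2 of the number of distinct types, computed separately for each sentential position. The function names, the toy POS tags, and the cut-off of three positions are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def relative_entropy(items):
    """Shannon entropy of the frequency distribution of `items`,
    divided by its maximum log2(number of distinct types).

    Assumption: this normalisation is one common reading of
    "relative entropy"; the abstract does not give the formula.
    The result lies in [0, 1].
    """
    counts = Counter(items)
    total = sum(counts.values())
    if len(counts) < 2:
        # Zero or one distinct type: no variation to measure.
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))


def relative_entropy_by_position(sentences, max_position=3):
    """Relative entropy of the forms found at each sentential position.

    `sentences` is a list of token lists (word-forms or POS tags).
    Positions are reported 1-based; sentences shorter than a given
    position are skipped for that position.
    """
    result = {}
    for pos in range(max_position):
        column = [s[pos] for s in sentences if len(s) > pos]
        result[pos + 1] = relative_entropy(column)
    return result


# Hypothetical toy data: POS tags of four short sentences.
toy_pos_tags = [
    ["PRON", "VERB", "NOUN"],
    ["PRON", "VERB", "ADJ"],
    ["DET", "NOUN", "VERB"],
    ["PRON", "AUX", "VERB"],
]

by_position = relative_entropy_by_position(toy_pos_tags)
for position, value in by_position.items():
    print(f"position {position}: relative entropy = {value:.3f}")
```

Under this assumed definition, a position filled by the same form in every sentence scores 0, and a position where all observed forms are equally frequent scores 1, so values are comparable across text types with different vocabulary sizes.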