Across Time, Space, and Genres: Measuring Probabilistic Grammar Distances Between Varieties of Mandarin

Yi Li, Benedikt Szmrecsanyi, Weiwei Zhang
DOI: https://doi.org/10.1515/lingvan-2022-0134
2024-01-01
Linguistics Vanguard
Abstract: This paper aims to quantify distances between varieties of Mandarin (diachronic, regional, and situational) as a function of the similarity in the choice between syntactic variants in the Mandarin theme-recipient alternation (yŭ/gei dative alternation). We use a novel corpus-based method, Variation-Based Distance and Similarity Modeling, which draws inspiration from work in comparative sociolinguistics and quantitative dialectometry. Analysis reveals that, while there is a relatively stable probabilistic grammar across the investigated varieties, historical varieties do exhibit a relatively higher degree of heterogeneity than synchronic varieties. Despite the overall high similarity of the latter, we identify substantial probabilistic differences between fictional writings of Modern Mainland Mandarin and all other synchronic varieties. Our findings thus provide evidence in support of the hypothesis that the transition from Early Mandarin to Modern Mandarin over the past two centuries has witnessed salient grammatical shifts and also empirically demonstrate the interaction between genre variability and regional variability in Modern Mandarin.