Correlation analysis between language gene polymorphism and geography/society parameter from twenty-six countries

Zongxuan Liu, Wei Xia, Bo Sun, Changlu Guo, Zhizhou Zhang
DOI: https://doi.org/10.21203/rs.3.rs-960107/v1
2021-01-01
Abstract: Human language diversity, as a biological phenotype, should be genetically linked with language gene polymorphism. At the same time, this phenotype has been shaped historically by local geographical and social factors. How many language gene polymorphisms correlate directly with geography/society characteristics over the long-run evolution of human languages is an interesting question that remains largely uninvestigated. This study selected a series of geography/society factors (13 geographical factors and 21 social factors) from 26 countries, together with 111 single nucleotide polymorphisms (SNPs) randomly selected from 13 language genes. Principal component analysis (PCA) was performed to explore their potential correlations. Preliminary but interesting results were obtained as follows. (1) Most geographical parameters are concentrated in one cluster in the PCA diagram; the cluster contains 12 parameters that are positively correlated with each other. (2) The PCA diagrams divide the social parameters into four clusters, among which both positive and negative correlations exist. (3) The strongest positive correlations were observed at one SNP of the ATP2C2 gene (ATP-1: rs78371901); the strongest negative correlations were found at one SNP of the NFXL1 gene (NFX-6: rs1440228); and the weakest correlations with language gene SNPs were observed for four geography/society factors: aash (annual average rainfall), fore (forest coverage), pden (population density of the country) and rway (runway traffic mode).
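As an illustration of how PCA can expose positively and negatively correlated variables as clusters, the following is a minimal sketch on synthetic data. The abstract does not state which software, preprocessing, or scaling the authors used, so everything here is an assumption: the function `pca_loadings`, the column names `geo_a`, `geo_b`, `snp_freq`, and `noise`, and the random values are hypothetical and are not taken from the study.

```python
import numpy as np

def pca_loadings(data, n_components=2):
    """Return (loadings, explained_variance_ratio) after standardizing columns.

    For standardized input, each loading equals the correlation between
    a variable and a principal component.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Singular value decomposition of the standardized matrix: z = u @ diag(s) @ vt
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    eigvals = s**2 / (z.shape[0] - 1)
    ratio = eigvals / eigvals.sum()
    loadings = vt.T[:, :n_components] * np.sqrt(eigvals[:n_components])
    return loadings, ratio[:n_components]

# Synthetic stand-in data: 26 rows play the role of countries (hypothetical values).
rng = np.random.default_rng(0)
n_countries = 26
base = rng.normal(size=n_countries)
geo_a = base + 0.1 * rng.normal(size=n_countries)      # tracks base
geo_b = base + 0.1 * rng.normal(size=n_countries)      # tracks base
snp_freq = -base + 0.1 * rng.normal(size=n_countries)  # moves against base
noise = rng.normal(size=n_countries)                   # unrelated variable

names = ["geo_a", "geo_b", "snp_freq", "noise"]
data = np.column_stack([geo_a, geo_b, snp_freq, noise])
loadings, ratio = pca_loadings(data)

for name, (pc1, pc2) in zip(names, loadings):
    print(f"{name:9s} PC1={pc1:+.2f} PC2={pc2:+.2f}")
print("explained variance ratio:", np.round(ratio, 2))
```

In a loading plot, variables with same-sign loadings on a component (here `geo_a` and `geo_b`) fall close together, while a variable with an opposite-sign loading (`snp_freq`) falls on the opposite side, which is one way clusters and negative correlations of the kind described above can be read from a PCA diagram.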