The landscape of gene–CDS–haplotype diversity in rice: Properties, population organization, footprints of domestication and breeding, and implications for genetic improvement

Fan Zhang, Chunchao Wang, Min Li, Yanru Cui, Yingyao Shi, Zhichao Wu, Zhiqiang Hu, Wensheng Wang, Jianlong Xu, Zhikang Li
DOI: https://doi.org/10.1016/j.molp.2021.02.003
IF: 27.5
2021-05-01
Molecular Plant
Abstract: Polymorphisms within gene coding regions represent the most important part of the overall genetic diversity of rice. We characterized the gene–coding sequence–haplotype (gcHap) diversity of 45 963 rice genes in 3010 rice accessions. With an average of 226 ± 390 gcHaps per gene in rice populations, rice genes could be classified into three main categories: 12 865 conserved genes, 10 254 subspecific differentiating genes, and 22 844 remaining genes. We found that 39 218 rice genes carry >255 179 major gcHaps of potential functional importance. Most (87.5%) of the detected gcHaps were specific to subspecies or populations. The inferred proto-ancestors of local landrace populations reconstructed from conserved predominant (ancient) gcHaps correlated strongly with wild rice accessions from the same geographic regions, supporting a multiorigin (domestication) model of Oryza sativa. Past breeding efforts generally increased the gcHap diversity of modern varieties and caused significant frequency shifts in predominant gcHaps of 14 266 genes due to independent selection in the two subspecies. Low frequencies of "favorable" gcHaps at most known genes related to rice yield in modern varieties suggest huge potential for rice improvement by mining and pyramiding of favorable gcHaps. The gcHap data were demonstrated to have greater power than SNPs for the detection of causal genes that affect complex traits. The rice gcHap diversity dataset generated in this study would facilitate rice basic research and improvement in the future.
biochemistry & molecular biology, plant sciences