Large-scale whole-genome sequencing of three diverse Asian populations in Singapore

Degang Wu, Jinzhuang Dou, Xiaoran Chai, Claire Bellis, Andreas Wilm, Chih Chuan Shih, Wendy Wei Jia Soon, Nicolas Bertin, Chiea Chuen Khor, Michael DeGiorgio, Sonia Maria Davila Dominguez, Patrick Tan, Asim Shabbir, Angela Moh, Eng-King Tan, Jia Nee Foo, Tan Tock Seng Hospital Healthy Control Workgroup, Roger S. Foo, Carolyn S.P. Lam, A. Mark Richards, Ching-Yu Cheng, Tin Aung, Tien Yin Wong, Jianjun Liu, Chaolong Wang, on behalf of the SG10K Consortium
DOI: https://doi.org/10.1101/390070
2018-08-11
Abstract: Asian populations are currently underrepresented in human genetics research. Here we present whole-genome sequencing data of 4,810 Singaporeans from three diverse ethnic groups: 2,780 Chinese, 903 Malays, and 1,127 Indians. Despite a medium depth of 13.7×, we achieved essentially perfect (>99.8%) sensitivity and accuracy for detecting common variants and good sensitivity (>89%) for detecting extremely rare variants with 0.01) that were absent in the existing public databases, highlighting the importance of local population reference for genetic diagnosis. We describe fine-scale genetic structure of Singapore populations and their relationship to worldwide populations from the 1000 Genomes Project. In addition to revealing noticeable amounts of admixture among three Singapore populations and a Malay-related novel ancestry component that has not been captured by the 1000 Genomes Project, our analysis also identified some fine-scale features of genetic structure consistent with two waves of prehistoric migration from south China to Southeast Asia. Finally, we demonstrate that our data can substantially improve genotype imputation not only for Singapore populations, but also for populations across Asia and Oceania. These results highlight the genetic diversity in Singapore and the potential impacts of our data as a resource to empower human genetics discovery in a broad geographic region.