AGIDB: a versatile database for genotype imputation and variant decoding across species

Kaili Zhang, Jiete Liang, Yuhua Fu, Jinyu Chu, Liangliang Fu, Yongfei Wang, Wangjiao Li, You Zhou, Jinhua Li, Xiaoxiao Yin, Haiyan Wang, Xiaolei Liu, Chunyan Mou, Chonglong Wang, Heng Wang, Xinxing Dong, Dawei Yan, Mei Yu, Shuhong Zhao, Xinyun Li, Yunlong Ma
DOI: https://doi.org/10.1093/nar/gkad913
IF: 14.9
2023-10-27
Nucleic Acids Research
Abstract: The high cost of large-scale, high-coverage whole-genome sequencing has limited its application in genomics and genetics research. The common approach has been to impute whole-genome sequence variants obtained from a few individuals for a larger population of interest individually genotyped using SNP chips. An alternative involves low-coverage whole-genome sequencing (lcWGS) of all individuals in the larger population, followed by imputation to sequence resolution. To overcome the limitations of processing lcWGS data and to meet specific genotype imputation requirements, we developed AGIDB (https://agidb.pro), a website comprising tools and a database with an unprecedented sample size and comprehensive variant decoding for animals. AGIDB integrates whole-genome sequencing and chip data from 17 360 and 174 945 individuals, respectively, across 89 species to identify over one billion variants, totaling 688.57 TB of processed data. AGIDB focuses on integrating multiple genotype imputation scenarios. It also provides user-friendly search and data analysis modules that enable comprehensive annotation of genetic variants for specific populations. To meet a wide range of research requirements, AGIDB offers downloadable reference panels for each species in addition to its extensive dataset, variant decoding and utility tools. We hope that AGIDB will become a key foundational resource in genetics and breeding, providing robust support to researchers.
biochemistry & molecular biology