PGG.SV: a whole-genome-sequencing-based structural variant resource and data analysis platform

Yimin Wang, Yunchao Ling, Jiao Gong, Xiaohan Zhao, Hanwen Zhou, Bo Xie, Haiyi Lou, Xinhao Zhuang, Li Jin, The Han100K Initiative, Shaohua Fan, Guoqing Zhang, Shuhua Xu
DOI: https://doi.org/10.1093/nar/gkac905
IF: 14.9
2022-10-17
Nucleic Acids Research
Abstract: Structural variations (SVs) play important roles in human evolution and diseases, but there is a lack of data resources concerning representative samples, especially for East Asians. Taking advantage of both next-generation sequencing and third-generation sequencing data at the whole-genome level, we developed the database PGG.SV to provide a practical platform for both regionally and globally representative structural variants. In its current version, PGG.SV archives 584 277 SVs obtained from whole-genome sequencing data of 6048 samples, including 1030 long-read sequencing genomes representing 177 global populations. PGG.SV provides (i) high-quality SVs with fine-scale and precise genomic locations in both GRCh37 and GRCh38, covering underrepresented SVs in existing sequencing and microarray data; (ii) hierarchical estimation of SV prevalence in geographical populations; (iii) informative annotations of SV-related genes, potential functions and clinical effects; (iv) an analysis platform to facilitate SV-based case-control association studies; and (v) various visualization tools for understanding the SV structures in the human genome. Taken together, PGG.SV provides a user-friendly online interface, easy-to-use analysis tools and a detailed presentation of results. PGG.SV is freely accessible via https://www.biosino.org/pggsv.
biochemistry & molecular biology