The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types

Tingting Chen, Xu Chen, Sisi Zhang, Junwei Zhu, Bixia Tang, Anke Wang, Lili Dong, Zhewen Zhang, Caixia Yu, Yanling Sun, Lianjiang Chi, Huanxin Chen, Shuang Zhai, Yubin Sun, Li Lan, Xin Zhang, Jingfa Xiao, Yiming Bao, Yanqing Wang, Zhang Zhang, Wenming Zhao
DOI: https://doi.org/10.1016/j.gpb.2021.08.001
2021-08-01
Genomics, Proteomics and Bioinformatics
Abstract: The Genome Sequence Archive (GSA) is a data repository for archiving raw sequence data, which provides data storage and sharing services for worldwide scientific communities. Considering explosive data growth with diverse data types, here we present the GSA family, which expands GSA into a set of resources for archiving raw data with different purposes, namely, GSA (https://ngdc.cncb.ac.cn/gsa/), GSA for Human (GSA-Human, https://ngdc.cncb.ac.cn/gsa-human/), and Open Archive for Miscellaneous Data (OMIX, https://ngdc.cncb.ac.cn/omix/). Compared with the 2017 version, GSA has been significantly updated in its data model, online functionalities, and web interfaces. GSA-Human, as a new partner of GSA, is a data repository specialized in human genetics-related data with controlled access and security. OMIX, as a critical complement to the two resources mentioned above, is an open archive for miscellaneous data. Together, these resources form a family dedicated to archiving explosively growing data of diverse types, accepting data submissions from all over the world, and providing free open access to all publicly available data in support of worldwide research activities.