Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications
Ruth E Timme, Ilene Karsch-Mizrachi, Zahra Waheed, Masanori Arita, Duncan MacCannell, Finlay Maguire, Robert Petit III, Andrew J Page, Catarina Inês Mendes, Muhammad Ibtisam Nasar, Paul Oluniyi, Andrea D Tyler, Amogelang R Raphenya, Jennifer L Guthrie, Idowu Olawoye, Gabriele Rinck, Colman O'Cathail, John Lees, Guy Cochrane, Carla Cummins, J Rodney Brister, William Klimke, Michael Feldgarden, Emma Griffiths
DOI: https://doi.org/10.1099/mgen.0.001145
Abstract: Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need not only to access the whole genomes of multiple pathogens but also to make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose that contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasitic), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.