Scalable, accessible and reproducible reference genome assembly and evaluation in Galaxy

Delphine Larivière, Linelle Abueg, Nadolina Brajuka, Cristóbal Gallardo-Alba, Bjorn Grüning, Byung June Ko, Alex Ostrovsky, Marc Palmada-Flores, Brandon D. Pickett, Keon Rabbani, Agostinho Antunes, Jennifer R. Balacco, Mark J. P. Chaisson, Haoyu Cheng, Joanna Collins, Melanie Couture, Alexandra Denisova, Olivier Fedrigo, Guido Roberto Gallo, Alice Maria Giani, Grenville MacDonald Gooder, Kathleen Horan, Nivesh Jain, Cassidy Johnson, Heebal Kim, Chul Lee, Tomas Marques-Bonet, Brian O'Toole, Arang Rhie, Simona Secomandi, Marcella Sozzoni, Tatiana Tilley, Marcela Uliano-Silva, Marius van den Beek, Robert W. Williams, Robert M. Waterhouse, Adam M. Phillippy, Erich D. Jarvis, Michael C. Schatz, Anton Nekrutenko, Giulio Formenti
DOI: https://doi.org/10.1038/s41587-023-02100-3
IF: 46.9
2024-01-27
Nature Biotechnology
Abstract: The Earth BioGenome Project aims to produce reference genomes for all ~1.8 million known eukaryotic species over the next decade [1,2,3,4]. Achieving this goal will require the current pace of reference genome production to increase by at least two orders of magnitude [1]. Automation of the assembly process with a pipeline that is widely accessible to any research group will be required to achieve this speed-up. Enabling this goal requires sustained effort in three major areas: genome assembly optimization and best-practice development, computational infrastructure provisioning, and dissemination and training.
biotechnology & applied microbiology