Tripal and Galaxy: supporting reproducible scientific workflows for community biological databases

Shawna Spoor, Connor Wytko, Brian Soto, Ming Chen, Abdullah Almsaeed, Bradford Condon, Nic Herndon, Heidi Hough, Sook Jung, Meg Staton, Jill Wegrzyn, Dorrie Main, F Alex Feltus, Stephen P Ficklin
DOI: https://doi.org/10.1093/database/baaa032
2020-01-01
Abstract: Online biological databases housing genomics, genetic and breeding data can be constructed using the Tripal toolkit. Tripal is an open-source, internationally developed framework that implements FAIR data principles and is meant to ease the burden of constructing such websites for research communities. Use of a common, open framework improves the sustainability and manageability of such a site. Site developers can create extensions for their site and in turn share those extensions with others. One challenge that community databases often face is the need to provide tools for their users that analyze increasingly large datasets using multiple software tools strung together in a scientific workflow on complicated computational resources. The Tripal Galaxy module, a 'plug-in' for Tripal, meets this need through integration of Tripal with the Galaxy Project workflow management system. Site developers can create workflows appropriate to the needs of their community using Galaxy and then share those workflows for execution on their Tripal sites, either via automatically constructed, but configurable, web forms or using an application programming interface to power web-based analytical applications. The Tripal Galaxy module helps reduce duplication of effort by allowing site developers to spend time constructing workflows and building their applications rather than rebuilding infrastructure for job management of multi-step applications.
What problem does this paper attempt to address?