Towards Composing Data Aware Systems Biology Workflows on Cloud Platforms: A MeDICi-Based Approach

Ian Gorton, Yan Liu, Yin Jian, Anand Kulkarni, Adam Wynne
DOI: https://doi.org/10.1109/SERVICES.2011.22
2011-01-01
Abstract: Cloud computing is being increasingly adopted for deploying systems biology scientific workflows. Scientists developing these workflows use a wide variety of fragmented and competing data sets and computational tools of all scales to support their research. To this end, the synergy of client-side workflow tools with cloud platforms is a promising approach to sharing and reusing data and workflows. In such systems, the location of data and computation is an essential consideration for quality of service when composing a scientific workflow across remote cloud platforms. In this paper, we describe a cloud-based workflow for genome annotation processing that is underpinned by MeDICi, a middleware designed for data-intensive scientific applications. The workflow implementation incorporates an execution layer that exploits data locality by routing workflow requests to the processing steps that are colocated with the data. We demonstrate our approach by composing two workflows with the MeDICi pipelines.