MAESTRO: a lightweight ontology-based framework for composing and analyzing script-based scientific experiments
DOI: https://doi.org/10.1007/s10115-024-02134-2
IF: 2.7
2024-06-11
Knowledge and Information Systems
Abstract: Over the last decades, there has been a rapid growth in the number of scientific experiments implemented as computational simulations. These experiments typically consist of multiple steps, where different programs, in-house scripts, or services may be used at each step. Workflows have served as an abstraction to model such experiments, and such workflows can be implemented in various ways, with many users choosing scripting languages like Python. Although scripts offer users the flexibility to compose workflows with complex constructs and data structures, they typically represent isolated workflows rather than encompassing the entire experiment. Within the same experiment, users may explore different configurations to confirm or refute their hypotheses, leading to the execution of different (but associated) workflows. Composing and analyzing scientific experiments associated with multiple workflows implemented as scripts is an open, yet important, task. Poor choices during composition can lead to inconsistencies, such as format incompatibility and problems in script dependencies. Moreover, even with a well-specified and properly executed script, analyzing the data produced from an isolated workflow without knowledge of the experiment's structure, domain terms, and specifications can be challenging. In this article, we introduce MAESTRO, a lightweight framework based on the use of ontologies and provenance to assist in the composition and analysis of experiments implemented using scripts. MAESTRO integrates the concept of Experiment Lines to represent the workflow at an abstract level and employs reasoners to derive a script-based workflow based on the abstract experiment representation and to support analytical queries. The feasibility of MAESTRO was evaluated through a study in the bioinformatics domain, receiving positive feedback from experts in e-science.
computer science, information systems, artificial intelligence