SWORTS: A Scientific Workflow Retrieval Approach by Learning Textual and Structural Semantics.
Yang Gu, Jian Cao, Shiyou Qian, Wei Guan
DOI: https://doi.org/10.1109/tsc.2023.3315478
IF: 11.019
2023-01-01
IEEE Transactions on Services Computing
Abstract: Finding scientific workflow models in public repositories that can be reused or repurposed is becoming increasingly common in the scientific community. Current retrieval approaches for workflow models rely mainly on text matching between queries and model descriptions. The structural information of these models, which includes inputs, outputs, data processing modules, and the datalinks connecting modules, expresses their functions in more detail, yet it is not used in the retrieval process. We therefore propose SWORTS, a two-stage framework for Scientific WOrkflow Retrieval by learning Textual and Structural semantics. The framework comprises a workflow pre-selection step and a workflow ranking step. In the first step, text similarity approaches quickly identify candidate workflow models. In the second step, a hierarchy-based matching degree prediction model, which takes both textual and structural features into account, is trained to predict the degree of semantic matching between requirement specifications and candidate workflows, and the candidates are ranked by that degree. Experimental results on real-world datasets show that, compared with state-of-the-art methods, SWORTS achieves the best performance with a relatively balanced trade-off between effectiveness and efficiency.
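The two-stage idea in the abstract (cheap text-based pre-selection, then ranking that also considers structure) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a trained, hierarchy-based matching degree prediction model, whereas the sketch below substitutes a hand-weighted Jaccard overlap purely for illustration. The `Workflow` class, the function names, and the parameters `k` and `alpha` are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Workflow:
    """Hypothetical workflow record: a description plus structural parts."""
    name: str
    description: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    modules: List[str] = field(default_factory=list)
    datalinks: List[Tuple[str, str]] = field(default_factory=list)


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word set of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def structural_tokens(w: Workflow) -> Set[str]:
    """Words drawn from inputs, outputs, module names, and datalink endpoints."""
    parts = w.inputs + w.outputs + w.modules
    parts += [end for link in w.datalinks for end in link]
    return tokens(" ".join(parts))


def preselect(query: str, workflows: List[Workflow], k: int) -> List[Workflow]:
    """Stage 1: keep the k workflows whose descriptions best match the query."""
    q = tokens(query)
    by_text = sorted(workflows,
                     key=lambda w: jaccard(q, tokens(w.description)),
                     reverse=True)
    return by_text[:k]


def rank(query: str, candidates: List[Workflow], alpha: float = 0.5) -> List[Workflow]:
    """Stage 2: re-rank candidates by a weighted mix of textual and
    structural overlap (a stand-in for the paper's trained model)."""
    q = tokens(query)

    def score(w: Workflow) -> float:
        text = jaccard(q, tokens(w.description))
        struct = jaccard(q, structural_tokens(w))
        return alpha * text + (1 - alpha) * struct

    return sorted(candidates, key=score, reverse=True)


def retrieve(query: str, workflows: List[Workflow],
             k: int = 2, alpha: float = 0.5) -> List[Workflow]:
    """Run pre-selection, then ranking."""
    return rank(query, preselect(query, workflows, k), alpha)
```

In this sketch, a workflow with a weaker description match can still rank first in stage 2 if its inputs, outputs, and module names overlap strongly with the query, which is the intuition behind using structural information that the abstract argues for.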