Do Large Language Models Speak Scientific Workflows?

Orcun Yildiz, Tom Peterka
2024-12-14
Abstract: With the advent of large language models (LLMs), there is a growing interest in applying LLMs to scientific tasks. In this work, we conduct an experimental study to explore the applicability of LLMs for configuring, annotating, translating, explaining, and generating scientific workflows. We use five different workflow-specific experiments and evaluate several open- and closed-source language models using state-of-the-art workflow systems. Our studies reveal that LLMs often struggle with workflow-related tasks due to their lack of knowledge of scientific workflows. We further observe that the performance of LLMs varies across experiments and workflow systems. Our findings can help workflow developers and users understand LLMs' capabilities in scientific workflows, and motivate further research applying LLMs to workflows.
Human-Computer Interaction
What problem does this paper attempt to address?
This paper evaluates how applicable large language models (LLMs) are to scientific workflows. Specifically, the authors run a series of experiments that measure how well LLMs configure, annotate, translate, interpret, and generate scientific workflows.

### Core Problems of the Paper

1. **Complexity of Scientific Workflows**:
   - Scientific workflows usually involve multiple interrelated tasks and have extensive data and computing requirements.
   - Current LLMs may lack an in-depth understanding of scientific workflows, resulting in poor performance on these tasks.
2. **Difficulty in Using Scientific Workflow Systems**:
   - Although scientific workflow systems can simplify task management and data exchange, many scientists find these systems difficult to use and often choose to run tasks manually or develop their own solutions.
   - LLMs have the potential to help with these problems, but their capabilities need in-depth research and evaluation.
3. **Limitations of Existing Research**:
   - Previous research has mainly focused on specific tasks related to high-performance computing (HPC), such as code generation, annotation, and answering queries.
   - Comprehensive research on applying LLMs across complete workflow systems is lacking.

### Research Objectives

- **Evaluate the Capabilities of LLMs**: Through multiple experiments, evaluate the performance of different LLMs in scientific workflows, including configuring, annotating, translating, interpreting, and generating workflows.
- **Reveal the Advantages and Limitations of LLMs**: Identify the strengths and weaknesses of LLMs in handling scientific workflow tasks.
- **Promote Further Research**: Help workflow developers and users understand how LLMs apply to scientific workflows, and stimulate more research on applying LLMs to them.

### Experimental Setup

The authors selected five different experiments to evaluate the performance of LLMs in scientific workflows:

1. **Workflow Configuration**: Assess the ability of LLMs to generate workflow configuration scripts.
2. **Task Code Annotation**: Evaluate the ability of LLMs to automatically annotate user task code.
3. **Task Code Translation**: Test the ability of LLMs to translate task code between different workflow systems.
4. **Workflow Interpretation**: Evaluate the ability of LLMs to understand and interpret scientific workflows.
5. **Mini-Application Development**: Ask LLMs to develop workflow benchmarking programs that combine HPC and AI tasks.

### Conclusions

Through these experiments, the authors aim to provide an in-depth understanding of the performance of LLMs in scientific workflows and to point out their advantages and limitations, thereby providing guidance for future research and applications.

---

In summary, this paper evaluates the applicability of large language models to scientific workflows through empirical research, reveals their potential advantages and limitations, and aims to provide insights for the development and application of scientific workflows.