Data Science Tasks Implemented with Scripts Versus GUI-Based Workflows: the Good, the Bad, and the Ugly

Alexander K. Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, Wei Wang, Chen Li
DOI: https://doi.org/10.1109/icdew61823.2024.00040
2024-01-01
Abstract: As leveraging large-scale data analytics becomes the norm for many applications, platforms used to develop these capabilities have become increasingly important. In this work, we compare the benefits and drawbacks of implementations of two commonly used data science platform paradigms: code-based scripts and GUI-based workflows. We implement tasks in both paradigms that provide examples of phases in the typical life cycle of a data science project, including data wrangling, machine learning (ML) model training, and inference. We examine the relative performance of the implementations under each paradigm in various experimental settings. We discuss the benefits and drawbacks associated with each platform implementation and provide a foundation for future work in comparing data science platform paradigms.