LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities

Shengcheng Yu, Chunrong Fang, Yuchen Ling, Chentian Wu, Zhenyu Chen

2023-09-24

Abstract: This paper investigates the application of large language models (LLMs) in the domain of mobile application test script generation. Test script generation is a vital component of software testing, enabling efficient and reliable automation of repetitive test tasks. However, existing generation approaches often encounter limitations, such as difficulties in accurately capturing and reproducing test scripts across diverse devices, platforms, and applications. These challenges arise from differences in screen sizes, input modalities, platform behaviors, and application architectures, as well as from API inconsistencies. Overcoming these limitations is crucial for achieving robust and comprehensive test automation. By leveraging the capabilities of LLMs, we aim to address these challenges and explore their potential as a versatile tool for test automation. We investigate how well LLMs can adapt to diverse devices and systems while accurately capturing and generating test scripts. Additionally, we evaluate their cross-platform generation capabilities by assessing their ability to handle operating system variations and platform-specific behaviors. Furthermore, we explore the application of LLMs in cross-app migration, where they generate test scripts across different applications and software environments based on existing scripts. Throughout the investigation, we analyze their adaptability to various user interfaces, app architectures, and interaction patterns, ensuring accurate script generation and compatibility. The findings of this research contribute to the understanding of LLMs' capabilities in test automation. Ultimately, this research aims to enhance software testing practices, empowering app developers to achieve higher levels of software quality and development efficiency.

Software Engineering

What problem does this paper attempt to address?

The paper primarily explores the applications, challenges, and opportunities of large language models (LLMs) in the generation and migration of mobile application test scripts. The core objective of the research is to leverage the capabilities of LLMs to address a series of key issues in mobile application testing, specifically including:

1. **Scenario-based Test Script Generation**:
   - Evaluating whether LLMs can effectively generate corresponding test scripts through natural language descriptions of specific application scenarios.
   - The study found that while LLMs can generate syntactically correct scripts that align with expected processes, they may encounter technical issues during direct execution, such as using incompatible APIs or random pop-ups interrupting script execution.
   - The paper proposes a conversational approach, guiding LLMs step-by-step to explore the application and automatically generate test scripts, reducing the need for manual intervention.
2. **Cross-platform Test Script Migration**:
   - Evaluating whether LLMs can effectively migrate test scripts from one platform to another (e.g., from Android to iOS).
   - Experiments in login scenarios indicate that despite some technical limitations (such as improper focus management of input fields), LLMs can generally succeed in generating new executable test scripts.
3. **Cross-app Test Script Migration**:
   - Evaluating the ability of LLMs to migrate test scripts between different applications with similar functionalities.
   - In this direction, although LLMs show some capability, challenges remain, especially in handling complex context memory and adapting to new environments.

In summary, the paper explores the potential applications of LLMs in the field of mobile application testing through three research questions (RQs) and identifies some limitations and future research directions. Through these studies, the paper aims to support software testing practices, helping developers improve software quality and development efficiency.
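
The conversational approach summarized above can be pictured as a loop in which the model sees the current screen, proposes one action, and the accepted actions accumulate into a script. The following is only a minimal sketch of that idea, not the paper's implementation: `generate_script`, `FakeApp`, and `scripted_model` are hypothetical names invented here, the action strings are an illustrative format, and the rule-based `scripted_model` merely stands in for a real LLM call so the example runs offline.

```python
from typing import Callable, List


def generate_script(
    goal: str,
    observe: Callable[[], str],
    ask_model: Callable[[str, str, List[str]], str],
    apply_action: Callable[[str], None],
    max_steps: int = 10,
) -> List[str]:
    """Build a script one action per turn until the model answers 'DONE'."""
    script: List[str] = []
    for _ in range(max_steps):
        screen = observe()                        # describe the current UI state
        action = ask_model(goal, screen, script)  # one conversational turn
        if action == "DONE":
            break
        apply_action(action)                      # execute the step on the app
        script.append(action)                     # keep it as a script line
    return script


class FakeApp:
    """Hypothetical stand-in for a device: a login screen hidden by a pop-up."""

    def __init__(self) -> None:
        self.screen = "login"
        self.popup = True

    def observe(self) -> str:
        return "popup" if self.popup else self.screen

    def apply(self, action: str) -> None:
        if action == "tap('close_popup')":
            self.popup = False
        elif action == "tap('login_button')":
            self.screen = "home"


def scripted_model(goal: str, screen: str, script: List[str]) -> str:
    """Rule-based placeholder for an LLM (assumption: a real system calls a model)."""
    if screen == "popup":
        return "tap('close_popup')"
    if screen == "login":
        for step in ("type('username', 'alice')", "type('password', 'secret')"):
            if step not in script:
                return step
        return "tap('login_button')"
    return "DONE"
```

Because each turn starts from a fresh observation, an unexpected pop-up is handled like any other screen instead of breaking a script that was generated in one shot. In a real setup, `observe` and `apply_action` would wrap a UI automation driver and `ask_model` would send the conversation history to an LLM; those integrations are deliberately left out here.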
