Advancing Bug Detection in Fastjson2 with Large Language Models Driven Unit Test Generation

Zhiyuan Zhong, Sinan Wang, Hailong Wang, Shaojin Wen, Hao Guan, Yida Tao, Yepang Liu
2024-10-12
Abstract: Data-serialization libraries are essential tools in software development, responsible for converting between programmable data structures and data persistence formats. Among them, JSON is the most popular choice for exchanging data between different systems and programming languages, while JSON libraries serve as the programming toolkit for this task. Despite their widespread use, bugs in JSON libraries can cause severe issues such as data inconsistencies and security vulnerabilities. Unit test generation techniques are widely adopted to identify bugs in various libraries. However, there is limited systematic testing effort specifically for exposing bugs within JSON libraries in industrial practice. In this paper, we propose JSONTestGen, an approach leveraging large language models (LLMs) to generate unit tests for fastjson2, a popular open source JSON library from Alibaba. Pre-trained on billions of open-source text and code corpora, LLMs have demonstrated remarkable abilities in programming tasks. Based on historical bug-triggering unit tests, we utilize LLMs to generate more diverse test cases by incorporating JSON domain-specific mutation rules. To systematically and efficiently identify potential bugs, we adopt differential testing on the results of the generated unit tests. Our evaluation shows that JSONTestGen outperforms existing test generation tools in unknown defect detection. With JSONTestGen, we found 34 real bugs in fastjson2, 30 of which have already been fixed, including 12 non-crashing bugs. While manual inspection reveals that LLM-generated tests can be erroneous, particularly with self-contradictory assertions, we demonstrate that LLMs have the potential for classifying false-positive test failures. This suggests a promising direction for improved test oracle automation in the future.
Software Engineering
### What problems does this paper attempt to solve?

This paper addresses undetected bugs and vulnerabilities in JSON libraries, focusing on fastjson2, Alibaba's open-source JSON library. The authors propose **JSONTestGen**, an approach based on large language models (LLMs) that generates unit tests to uncover unknown bugs in fastjson2 more systematically and efficiently, including non-crashing functional bugs.

#### Background and Motivation

1. **Importance of JSON Libraries**
   - JSON is a widely used data-exchange format in modern software development.
   - JSON libraries convert between programmable data structures and data persistence formats, and support data exchange between different systems and programming languages.
2. **Existing Problems**
   - Despite their widespread use, JSON libraries can contain bugs with serious consequences, such as data inconsistencies and security vulnerabilities.
   - Existing unit test generation techniques can identify certain types of bugs, but systematic testing aimed specifically at JSON libraries remains limited, especially for non-crashing logic bugs.
3. **Research Motivation**
   - Use LLMs to automatically generate more diverse unit tests that cover the APIs of a JSON library more comprehensively.
   - Use differential testing, which compares the results of different versions or implementations, to identify potential unknown bugs.

#### Solution

The authors propose **JSONTestGen**, whose main steps are:

1. **Collect Historical Unit Tests**
   - Collect unit tests related to historical issues from the fastjson2 GitHub repository as the seed data set.
2. **Understanding Stage**
   - Use an LLM to summarize each seed unit test and extract key information such as the target APIs and core operations.
3. **Generation Stage**
   - Combine the summaries with JSON domain-specific mutation rules and use an LLM to generate new unit tests.
4. **Differential Testing**
   - Execute the newly generated unit tests and identify potential bugs by comparing the results of different JSON library implementations.

#### Main Contributions

- **First Application of LLMs to Bug Detection in JSON Libraries**: The approach learns from existing unit tests to automatically generate diverse test cases for bug detection.
- **Design of Effective Prompting Strategies**: JSON-specific mutation rules guide the LLM toward high-quality unit tests.
- **Successful Discovery of Unknown Bugs**: JSONTestGen found 34 previously unknown bugs in fastjson2, including 12 non-crashing bugs, which are difficult for existing tools to detect.
- **Exploring Improved Test Oracle Automation**: The authors analyze failing tests and explore whether LLMs can classify false-positive test failures caused by incorrect test logic.

With this approach, the authors demonstrate the potential of LLMs in software testing, especially for improving test coverage and discovering complex bugs.
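The differential testing step in the Solution section can be illustrated with a minimal Java sketch. It is not the authors' implementation and does not call fastjson2: the class `DifferentialSketch`, its methods, and the two stand-in number parsers (`exactParser`, `lossyParser`) are hypothetical names introduced here, using only the Java standard library. The idea shown is that the same inputs are fed to two implementations and any input on which their outcomes disagree is flagged for inspection.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of differential testing; not part of JSONTestGen or fastjson2.
final class DifferentialSketch {

    /** Outcome of one implementation on one input: a parsed value or an exception name. */
    static final class Outcome {
        final BigDecimal value; // null if the call threw
        final String error;     // null if the call succeeded

        Outcome(BigDecimal value, String error) {
            this.value = value;
            this.error = error;
        }

        /** Two outcomes agree if both failed, or both succeeded with numerically equal values. */
        boolean agreesWith(Outcome other) {
            if (error != null || other.error != null) {
                // Exception types may legitimately differ between libraries,
                // so only "threw vs. did not throw" is compared.
                return error != null && other.error != null;
            }
            return value.compareTo(other.value) == 0;
        }
    }

    /** Runs one parser on one input and captures either its result or its exception. */
    static Outcome run(Function<String, BigDecimal> parser, String input) {
        try {
            return new Outcome(parser.apply(input), null);
        } catch (RuntimeException e) {
            return new Outcome(null, e.getClass().getSimpleName());
        }
    }

    /** Returns the inputs on which the implementation under test disagrees with the reference. */
    static List<String> findDiscrepancies(Function<String, BigDecimal> underTest,
                                          Function<String, BigDecimal> reference,
                                          List<String> inputs) {
        List<String> suspicious = new ArrayList<>();
        for (String input : inputs) {
            if (!run(underTest, input).agreesWith(run(reference, input))) {
                suspicious.add(input);
            }
        }
        return suspicious;
    }

    // Stand-in "reference" implementation: parses number literals exactly.
    static BigDecimal exactParser(String s) {
        return new BigDecimal(s);
    }

    // Stand-in "implementation under test": goes through double, so it can lose
    // precision (a non-crashing discrepancy) or fail on overflow (a crash).
    static BigDecimal lossyParser(String s) {
        return BigDecimal.valueOf(Double.parseDouble(s));
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("42", "0.5", "9007199254740993", "1e400", "abc");
        System.out.println(findDiscrepancies(
                DifferentialSketch::lossyParser, DifferentialSketch::exactParser, inputs));
    }
}
```

In this sketch, `9007199254740993` is flagged because the two parsers return different values without any exception, which mirrors the non-crashing bugs the summary highlights, while `1e400` is flagged because only one parser throws. The input `abc` is not flagged, since both parsers reject it. A real setup would replace the stand-in parsers with calls into the JSON libraries being compared and would compare richer outcomes than a single number.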