Automatic Library Migration Using Large Language Models: First Results

Aylton Almeida, Laerte Xavier, Marco Tulio Valente
DOI: https://doi.org/10.1145/3674805.3690746
2024-09-26
Abstract: Despite being introduced only a few years ago, Large Language Models (LLMs) are already widely used by developers for code generation. However, their application in automating other Software Engineering activities remains largely unexplored. Thus, in this paper, we report the first results of a study in which we are exploring the use of ChatGPT to support API migration tasks, an important problem that demands manual effort and attention from developers. Specifically, in the paper, we share our initial results involving the use of ChatGPT to migrate a client application to use a newer version of SQLAlchemy, an ORM (Object Relational Mapping) library widely used in Python. We evaluate the use of three types of prompts (Zero-Shot, One-Shot, and Chain Of Thoughts) and show that the best results are achieved by the One-Shot prompt, followed by the Chain Of Thoughts. Particularly, with the One-Shot prompt we were able to successfully migrate all columns of our target application and upgrade its code to use new functionalities enabled by SQLAlchemy's latest version, such as Python's asyncio and typing modules, while preserving the original code behavior.
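The three prompt types named in the abstract differ mainly in how much guidance accompanies the code to be migrated. The sketch below is only an illustration of that difference: the instruction text, the before/after example, the step list, and the `build_prompt` helper are all hypothetical and are not the prompts used in the paper.

```python
# Hypothetical instruction text; the paper's actual prompt wording is not reproduced here.
TASK = "Migrate the following Python code from SQLAlchemy 1.x to SQLAlchemy 2.0."

# Hypothetical before/after pair used as the single worked example for One-Shot.
EXAMPLE_BEFORE = "id = Column(Integer, primary_key=True)"
EXAMPLE_AFTER = "id: Mapped[int] = mapped_column(primary_key=True)"

# Hypothetical step list for Chain of Thoughts (assumed, not from the paper).
STEPS = [
    "Replace Column declarations with typed mapped_column declarations.",
    "Add Mapped[...] type annotations to every attribute.",
    "Switch the session and queries to their asyncio counterparts.",
]


def build_prompt(strategy: str, source_code: str) -> str:
    """Return a migration prompt for 'zero-shot', 'one-shot', or 'chain-of-thoughts'."""
    parts = [TASK]
    if strategy == "one-shot":
        # One-Shot: the task plus exactly one worked example.
        parts.append(f"Example.\nBefore:\n{EXAMPLE_BEFORE}\nAfter:\n{EXAMPLE_AFTER}")
    elif strategy == "chain-of-thoughts":
        # Chain of Thoughts: the task plus an ordered list of intermediate steps.
        numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(STEPS, start=1))
        parts.append(f"Work through these steps in order:\n{numbered}")
    elif strategy != "zero-shot":
        raise ValueError(f"unknown strategy: {strategy}")
    # Zero-Shot adds nothing beyond the task and the code itself.
    parts.append(f"Code to migrate:\n{source_code}")
    return "\n\n".join(parts)
```

Calling `build_prompt("zero-shot", code)` yields only the instruction and the code, while the other two strategies insert one extra section between them.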
Software Engineering
### What problem does this paper attempt to solve?

This paper explores how large language models (LLMs) can automatically support API migration tasks. Specifically, the authors chose ChatGPT as the tool and applied it to migrating a client application from SQLAlchemy version 1.0 to version 2.0. SQLAlchemy is a widely used Python ORM (Object-Relational Mapping) library, and its 2.0 version introduces many new features, such as support for static type checking and asynchronous programming. Through this study, the authors evaluate the effectiveness of three prompting methods (Zero-Shot, One-Shot, and Chain of Thoughts) in API migration and propose a framework to guide future automated migration work.

### Main Contributions:

1. **Proposed an initial framework for library migration using language models**: This framework includes a set of prompting methods and evaluation metrics to assess the correctness and quality of the migration task.
2. **Demonstrated preliminary results of using this framework to support the migration of client applications for a popular library (SQLAlchemy) in the Python ecosystem**: The authors analyzed the performance of different prompting methods during the migration process and discussed the quality of the migrated code and the test results.

### Research Background:

- **Importance of API Migration**: In modern software development, the rapid evolution of APIs brings many new features but often introduces breaking changes, so existing applications must be updated frequently to adapt to new API versions.
- **Challenges of Manual Migration**: Although API migration is a critical activity, mature and effective tools to support it are still lacking, especially in production environments.
- **Application of Large Language Models**: While LLMs have been widely used in software engineering tasks such as code generation, bug fixing, and code review, their application to API migration is still at an exploratory stage.

### Research Methodology:

- **Target API and Client Application**: The migration from SQLAlchemy 1.0 to 2.0 was chosen as the research subject, using a Python application named BiteStreams/fastapi-template as the client application.
- **Migration Process**: The authors first migrated the client application manually to establish a baseline, then used GPT-4.0 for automatic migration with the Zero-Shot, One-Shot, and Chain of Thoughts prompting methods.
- **Evaluation Metrics**: Migration effectiveness was evaluated by running the tests, by the Pylint and Pyright static analysis tools, by the number of migrated columns and methods, and by other metrics.

### Results:

- **Zero-Shot**: Performed the worst; the application could not be run, and no columns were migrated correctly.
- **One-Shot**: Performed the best; all columns and methods were migrated and all tests passed, but the Pylint and Pyright scores decreased.
- **Chain of Thoughts**: Second best, with migration effectiveness similar to One-Shot, but an import error prevented the application from running.

### Conclusion:

- The **One-Shot** method performed the best overall, generating a runnable application that passed all tests.
- The **Chain of Thoughts** method was second best, with a minor import error but otherwise good migration effectiveness.
- The **Zero-Shot** method performed the worst, failing to migrate any columns.

### Future Work Directions:

- Further optimize the prompting methods and evaluation metrics.
- Expand the scope of the research to more types of applications and APIs.
- Explore the migration of other programming languages and libraries.
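One of the evaluation metrics mentioned above is the number of migrated columns. As a rough illustration of how such a count could be automated, the sketch below uses Python's standard `ast` module to compare legacy `Column(...)` declarations with SQLAlchemy 2.0 style `mapped_column(...)` declarations. The `User` model, the two source snippets, and the helper functions are hypothetical and are not taken from the paper, whose exact counting procedure is not described here.

```python
import ast

# Hypothetical model in the legacy declarative style (not taken from the paper).
LEGACY = '''
class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
'''

# Hypothetical output of a partial migration: one column still uses Column().
MIGRATED = '''
class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    name = Column(String(50))
'''


def count_calls(source: str, func_name: str) -> int:
    """Count calls to a bare function name, e.g. Column(...), in the source.

    Attribute-style calls such as sa.Column(...) are deliberately not counted.
    """
    tree = ast.parse(source)
    return sum(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
        for node in ast.walk(tree)
    )


def migrated_column_ratio(source: str) -> float:
    """Fraction of column declarations that use mapped_column() instead of Column()."""
    legacy = count_calls(source, "Column")
    modern = count_calls(source, "mapped_column")
    total = legacy + modern
    return modern / total if total else 0.0


print(migrated_column_ratio(LEGACY))    # 0.0
print(migrated_column_ratio(MIGRATED))  # 0.5
```

A purely syntactic count like this says nothing about whether the migrated code behaves correctly, which is why the study also relies on running the tests and on static analysis.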