Abstract: Despite being introduced only a few years ago, Large Language Models (LLMs) are already widely used by developers for code generation. However, their application in automating other Software Engineering activities remains largely unexplored. Thus, in this paper, we report the first results of a study in which we are exploring the use of ChatGPT to support API migration tasks, an important problem that demands manual effort and attention from developers. Specifically, in the paper, we share our initial results involving the use of ChatGPT to migrate a client application to use a newer version of SQLAlchemy, an ORM (Object Relational Mapping) library widely used in Python. We evaluate the use of three types of prompts (Zero-Shot, One-Shot, and Chain Of Thoughts) and show that the best results are achieved by the One-Shot prompt, followed by the Chain Of Thoughts. Particularly, with the One-Shot prompt we were able to successfully migrate all columns of our target application and upgrade its code to use new functionalities enabled by SQLAlchemy's latest version, such as Python's asyncio and typing modules, while preserving the original code behavior.
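
To make the difference between the three prompt types concrete, here is a minimal Python sketch. The prompt wording, the `build_prompt` helper, and the before/after example pair are all assumptions for illustration; they are not the prompts used in the study. The only library facts relied on are that older SQLAlchemy code declares columns with `Column(...)` and that SQLAlchemy 2.0 offers `Mapped[...]` annotations with `mapped_column(...)`.

```python
# Hypothetical prompt templates: the study's actual prompt text is not reproduced here.
TASK = "Migrate the following Python code to SQLAlchemy 2.0, preserving its behavior."

# Assumed one-shot example: an older-style column and its 2.0-style equivalent.
EXAMPLE_BEFORE = "id = Column(Integer, primary_key=True)"
EXAMPLE_AFTER = "id: Mapped[int] = mapped_column(primary_key=True)"

# Assumed reasoning steps for the Chain of Thoughts variant.
COT_STEPS = (
    "Work step by step:\n"
    "1. List every Column(...) declaration in the models.\n"
    "2. Rewrite each one as a Mapped[...] annotation with mapped_column(...).\n"
    "3. Update the imports to match the rewritten code."
)


def build_prompt(style: str, source_code: str) -> str:
    """Return a prompt for one of: 'zero-shot', 'one-shot', 'chain-of-thoughts'."""
    parts = [TASK]
    if style == "one-shot":
        # One-Shot: show a single worked example before the real input.
        parts.append(
            f"Example input:\n{EXAMPLE_BEFORE}\nExample output:\n{EXAMPLE_AFTER}"
        )
    elif style == "chain-of-thoughts":
        # Chain of Thoughts: ask the model to follow explicit intermediate steps.
        parts.append(COT_STEPS)
    elif style != "zero-shot":
        raise ValueError(f"unknown prompt style: {style}")
    # Zero-Shot adds nothing beyond the task description and the code itself.
    parts.append(f"Code to migrate:\n{source_code}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("one-shot", "name = Column(String(50))"))
```

The three variants differ only in the middle section of the prompt, which keeps a comparison between them focused on the prompting technique rather than on incidental wording.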

### What problem does this paper attempt to solve?

This paper explores how large language models (LLMs) can be used to automatically support API migration tasks. Specifically, the authors chose ChatGPT as a tool and applied it to the task of migrating a client application from SQLAlchemy version 1.0 to version 2.0. SQLAlchemy is a widely used Python ORM (Object-Relational Mapping) library, and its 2.0 version introduces many new features, such as support for static type checking through Python's typing module and for asynchronous programming through asyncio. Through this study, the authors evaluate the effectiveness of different prompting methods (Zero-Shot, One-Shot, and Chain of Thoughts) in API migration and propose a framework to guide future automated migration work.

### Main Contributions:

1. **Proposed an initial framework for library migration using language models**: This framework includes a set of prompting methods and evaluation metrics to assess the correctness and quality of the migration task.
2. **Demonstrated preliminary results of using this framework to support the migration of client applications for a popular library (SQLAlchemy) in the Python ecosystem**: The authors analyzed the performance of different prompting methods during the migration process and discussed the quality of the migrated code and test results.

### Research Background:

- **Importance of API Migration**: In modern software development, the rapid evolution of APIs brings many new features but often introduces breaking changes, requiring existing applications to be updated frequently to adapt to new API versions.
- **Challenges of Manual Migration**: Although API migration is a critical activity, there is currently a lack of mature and effective tools to support this process, especially in production environments.
- **Application of Large Language Models**: While LLMs have been widely used in software engineering tasks such as code generation, bug fixing, and code review, their application to API migration is still in the exploratory stage.

### Research Methodology:

- **Target API and Client Application**: The migration from SQLAlchemy 1.0 to 2.0 was chosen as the research subject, using a Python application named BiteStreams/fastapi-template as the client application.
- **Migration Process**: The authors first migrated the client application manually to establish a baseline, then used GPT-4.0 for automatic migration, testing the Zero-Shot, One-Shot, and Chain of Thoughts prompting methods.
- **Evaluation Metrics**: Migration effectiveness was evaluated by running the application's tests, by applying the Pylint and Pyright static analysis tools, and by counting the migrated columns and methods, among other metrics.

### Results:

- **Zero-Shot**: Performed the worst. The resulting application could not be run, and no columns were migrated correctly.
- **One-Shot**: Performed the best. All columns and methods were migrated and all tests passed, but the Pylint and Pyright scores decreased.
- **Chain of Thoughts**: Second best. Migration effectiveness was similar to One-Shot, but an import error prevented the application from running.

### Conclusion:

- The **One-Shot** method performed the best overall, generating a runnable application that passed all tests.
- The **Chain of Thoughts** method was second best, with good overall migration effectiveness apart from an import error.
- The **Zero-Shot** method performed the worst, failing to migrate any columns.

### Future Work Directions:

- Further optimize prompting methods and evaluation metrics.
- Expand the scope of the research to test more types of applications and APIs.
- Explore migrations in other programming languages and for other libraries.
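
One of the evaluation metrics mentioned above, the number of migrated columns, could be approximated with a short script like the one below. This is a sketch under stated assumptions, not the authors' tooling: it assumes that an unmigrated column is declared with a call to `Column(...)` and a migrated one with a call to SQLAlchemy 2.0's `mapped_column(...)`, and the helper names `count_column_calls` and `migrated_ratio` are hypothetical. It only parses source text with Python's standard `ast` module, so SQLAlchemy does not need to be installed.

```python
import ast


def count_column_calls(source: str) -> dict[str, int]:
    """Count calls to Column(...) and mapped_column(...) in Python source text."""
    counts = {"Column": 0, "mapped_column": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Accept both bare names (Column) and attribute access (sa.Column).
            if isinstance(func, ast.Name):
                name = func.id
            else:
                name = getattr(func, "attr", None)
            if name in counts:
                counts[name] += 1
    return counts


def migrated_ratio(source: str) -> float:
    """Fraction of column declarations that use the 2.0-style mapped_column."""
    counts = count_column_calls(source)
    total = counts["Column"] + counts["mapped_column"]
    return counts["mapped_column"] / total if total else 0.0


# Illustrative input: a model where only one of two columns has been migrated.
PARTIALLY_MIGRATED = '''
class User(Base):
    __tablename__ = "user"
    id: Mapped[int] = mapped_column(primary_key=True)
    name = Column(String(50))
'''

if __name__ == "__main__":
    print(count_column_calls(PARTIALLY_MIGRATED))  # {'Column': 1, 'mapped_column': 1}
    print(migrated_ratio(PARTIALLY_MIGRATED))      # 0.5
```

A count like this only measures syntactic coverage. It says nothing about whether the migrated code behaves the same, which is why the study pairs it with running the tests and with static analysis.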

Automatic Library Migration Using Large Language Models: First Results

ChatGPT and large language models in academia: opportunities and challenges

Comparative Analysis of CHATGPT and the evolution of language models

Exploring the potential of large language models and generative artificial intelligence (GPT): Applications in Library and Information Science

Large Language Models Meet NLP: A Survey

Analysis of ChatGPT on Source Code

ChatGPT Alternative Solutions: Large Language Models Survey

Evaluation of the Programming Skills of Large Language Models

LLM4DS: Evaluating Large Language Models for Data Science Code Generation

Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study

On the assessment of generative AI in modeling tasks: an experience report with ChatGPT and UML

A Preliminary Study on Using Large Language Models in Software Pentesting

The emergence of Large Language Models (LLM) as a tool in literature reviews: an LLM automated systematic review

Large Language Models to the Rescue: Reducing the Complexity in Scientific Workflow Development Using ChatGPT

On the Effectiveness of Large Language Models in Domain-Specific Code Generation

Large Language Models: Their Success and Impact

Transformative Trends: A Comprehensive Review of Large Language Models (LLMs) in Healthcare

Enhancing Pipeline-Based Conversational Agents with Large Language Models

Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation

ChatGPT, Bard, and Large Language Models for Biomedical Research: Opportunities and Pitfalls

Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation