A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps

Shidong Pan, Tianchen Guo, Lihong Zhang, Pei Liu, Zhenchang Xing, Xiaoyu Sun
2024-06-27
Abstract: Application Programming Interface (API) incompatibility is a long-standing issue in Android application development. The rapid evolution of Android APIs results in a significant number of API additions, removals, and changes between adjacent versions. Unfortunately, this high frequency of alterations may lead to compatibility issues, often without adequate notification to developers regarding these changes. Although researchers have proposed some work on detecting compatibility issues caused by changes in API signatures, they often overlook compatibility issues stemming from sophisticated semantic changes. In response to this challenge, we conducted a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP) by leveraging static analysis and pre-trained Large Language Models (LLMs) across adjacent versions. We systematically formulate the problem and propose a unified framework to detect incompatible APIs, especially for semantic changes. It's worth highlighting that our approach achieves a 0.83 F1-score in identifying semantically incompatible APIs in the Android framework. Ultimately, our approach detects 5,481 incompatible APIs spanning from version 4 to version 33. We further demonstrate its effectiveness in supplementing the state-of-the-art methods in detecting a broader spectrum of compatibility issues (+92.3%) that have been previously overlooked.
Software Engineering
What problem does this paper attempt to address?
This paper addresses API incompatibility in Android application development, with a focus on detecting semantically incompatible APIs. Specifically:

1. **API Incompatibility Problem**:
   - Android evolves rapidly, and each release adds, removes, and modifies APIs. These changes can cause apps to crash or behave abnormally on different versions of the Android system.
   - Existing research detects compatibility problems caused by API signature changes, but it often overlooks problems caused by API semantic changes (i.e., behavior changes).
2. **Research Objectives**:
   - Discover semantically incompatible APIs in the Android Open Source Project (AOSP) through a large-scale analysis of the changes between adjacent versions.
   - Propose a unified framework that combines static analysis and pre-trained Large Language Models (LLMs) to systematically identify semantically incompatible APIs.
3. **Main Contributions**:
   - **Problem Definition**: Formalize the Android compatibility problem as the problem of detecting incompatible APIs.
   - **Framework Design**: Design and implement a unified framework that uses LLMs to identify semantically incompatible APIs.
   - **Effectiveness Verification**: Show experimentally that the discovered incompatible APIs help existing tools (such as CiD) detect a wider range of compatibility problems, especially those caused by semantic changes.
4. **Specific Methods**:
   - **Information Extraction**: Extract API signatures, API bodies, comments, and annotations from AOSP.
   - **Signature-Incompatible API Detection**: Detect the addition and removal of APIs by comparing the API signature lists of adjacent versions.
   - **Semantic-Incompatible API Detection**: For APIs whose signatures remain unchanged but whose implementations have changed, use LLMs to analyze the code changes and determine whether they lead to semantic incompatibility.
   - **Heuristic Comparison**: Guide LLMs to analyze code changes through prompt engineering to identify potential types of semantic incompatibility.
5. **Experimental Results**:
   - The approach achieves an F1-score of 0.83 on a manually constructed benchmark dataset.
   - It detects 5,481 incompatible APIs across versions 4 to 33. Used to supplement state-of-the-art methods, these APIs let those methods detect a broader spectrum of previously overlooked compatibility issues (+92.3%).

In conclusion, by combining LLMs with static analysis, the paper improves the detection of semantically incompatible APIs and thereby helps address compatibility problems in Android application development.
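
The comparison of adjacent versions described under the specific methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each version has already been reduced to a mapping from API signature to method body, and the function names, the whitespace normalization, and the `com.example.Widget` signatures are all hypothetical. Added and removed signatures are the signature-incompatible APIs; signatures present in both versions whose bodies differ are only candidates that would still need semantic analysis (by an LLM, in the paper's framework).

```python
from typing import Dict, NamedTuple, Set


class ApiDiff(NamedTuple):
    added: Set[str]                # signatures only in the newer version
    removed: Set[str]              # signatures only in the older version
    semantic_candidates: Set[str]  # same signature, different body


def normalize_body(body: str) -> str:
    """Collapse whitespace so formatting-only edits are not flagged (an assumption)."""
    return " ".join(body.split())


def diff_adjacent_versions(old: Dict[str, str], new: Dict[str, str]) -> ApiDiff:
    """Compare two {signature: body} maps extracted from adjacent versions."""
    old_sigs, new_sigs = set(old), set(new)
    added = new_sigs - old_sigs
    removed = old_sigs - new_sigs
    # Unchanged signature but changed implementation: a candidate only,
    # since not every body change alters observable behavior.
    changed = {
        sig
        for sig in old_sigs & new_sigs
        if normalize_body(old[sig]) != normalize_body(new[sig])
    }
    return ApiDiff(added, removed, changed)


# Hypothetical data, not taken from AOSP.
v_old = {
    "com.example.Widget.show()": "{ render(); }",
    "com.example.Widget.hide()": "{ visible = false; }",
    "com.example.Widget.legacy()": "{ }",
}
v_new = {
    "com.example.Widget.show()": "{ if (!attached) throw new IllegalStateException(); render(); }",
    "com.example.Widget.hide()": "{  visible = false;  }",
    "com.example.Widget.resize(int)": "{ }",
}

diff = diff_adjacent_versions(v_old, v_new)
print(sorted(diff.added))
print(sorted(diff.removed))
print(sorted(diff.semantic_candidates))
```

In this toy input, `hide()` differs only in whitespace and is therefore not flagged, while `show()` gains a new exception path and is passed on as a candidate for semantic review.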