Has My Release Disobeyed Semantic Versioning? Static Detection Based on Semantic Differencing
Lyuye Zhang, Chengwei Liu, Zhengzi Xu, Sen Chen, Lingling Fan, Bihuan Chen, Yang Liu
DOI: https://doi.org/10.1145/3551349.3556956
2022-01-01
Abstract: To enhance compatibility in the version control of Java Third-party Libraries (TPLs), Maven adopts Semantic Versioning (SemVer) to standardize the underlying meaning of versions, but users can still encounter abnormal execution and crashes after upgrades even when compilation and linkage succeed. These failures are caused by semantic breaking (SemB) issues, in which APIs directly used by users have identical signatures but inconsistent semantics across upgrades. To strengthen compliance with SemVer rules, developers and users should be alerted to such issues. Unfortunately, they are challenging to detect statically, because semantic changes in the internal methods of APIs are difficult to capture. Dynamic testing can confirm some of them, but it is limited by inadequate coverage. To detect SemB issues in upgrades that SemVer treats as compatible (Patch and Minor), we conduct an empirical study of 180 SemB issues to understand their root causes. Inspired by the findings, we propose SEMBID (Semantic Breaking Issue Detector) to statically detect such issues in TPLs for developers and users. Since APIs are directly used by users, SEMBID detects and reports SemB issues at the API level. For each API paired across two versions, SEMBID walks through the call chains originating from the API to locate breaking changes by measuring semantic diff. Then, SEMBID checks whether the breaking changes can affect the API's output along the call chains. The evaluation showed that SEMBID achieved 90.26% recall and 81.29% precision and outperformed other API checkers on SemB API detection. We also show that SEMBID detected over 3 times more SemB APIs, with better coverage, than unit tests, the commonly used solution. Furthermore, we carried out an empirical study on 1,629,589 APIs from 546 version pairs of top Java libraries and found 2 to 4 times more SemB APIs than APIs with signature-based issues. Due to various version release strategies, 33.83% of Patch version pairs and 64.42% of Minor version pairs had at least one API affected by some breaking change.
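
To make the call-chain idea concrete, the following is a minimal sketch and not SEMBID's actual algorithm. It assumes a toy model in which each library version is a dictionary from method names to a body string and a list of callees. The names `Method`, `changed_on_call_chain`, and the `Parser` methods are hypothetical. Plain string comparison of bodies stands in for the semantic diff, and the sketch omits the later check of whether a change can reach the API's output.

```python
from typing import Dict, List, NamedTuple, Set


class Method(NamedTuple):
    body: str            # hypothetical stand-in for the method's semantics
    callees: List[str]   # names of the methods this method calls


Library = Dict[str, Method]


def changed_on_call_chain(old: Library, new: Library, api: str) -> Set[str]:
    """Return methods reachable from `api` whose bodies differ between versions."""
    changed: Set[str] = set()
    seen: Set[str] = set()
    stack = [api]
    while stack:
        name = stack.pop()
        # Skip methods already visited or missing from either version.
        if name in seen or name not in old or name not in new:
            continue
        seen.add(name)
        if old[name].body != new[name].body:
            changed.add(name)
        # Follow callees from both versions so newly added calls are visited too.
        stack.extend(old[name].callees)
        stack.extend(new[name].callees)
    return changed


# Two toy versions: the public signatures are identical, but an internal
# helper on the call chain of parse() now behaves differently.
v1: Library = {
    "Parser.parse(String)": Method("return normalize(s)", ["Parser.normalize(String)"]),
    "Parser.normalize(String)": Method("return s.trim()", []),
    "Parser.version()": Method("return 1", []),
}
v2: Library = {
    "Parser.parse(String)": Method("return normalize(s)", ["Parser.normalize(String)"]),
    "Parser.normalize(String)": Method("return s.trim().toLowerCase()", []),
    "Parser.version()": Method("return 1", []),
}

public_apis = ["Parser.parse(String)", "Parser.version()"]
suspects = {api: changed_on_call_chain(v1, v2, api) for api in public_apis}
```

In this toy model, `Parser.parse(String)` is flagged because its helper changed even though its own signature and body did not, which is the situation the abstract calls a SemB issue. `Parser.version()` is not flagged because nothing on its call chain changed.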