A Large-Scale Empirical Study of Real-LifePerformance Issues in Open Source Projects
Yutong Zhao,Lu Xiao,Andre B. Bondi,Bihuan Chen,Yang Liu
DOI: https://doi.org/10.1109/tse.2022.3167628
IF: 7.4
2022-01-01
IEEE Transactions on Software Engineering
Abstract:Software performance is a critical quality attribute that determines the success of a software system. However, practitioners lack comprehensive and holistic understanding of how real-life performance issues are caused and resolved in practice from the technical, engineering, and economic perspectives. This paper presents a large-scale empirical study of 570 real-life performance issues from 13 open source projects from various problem domains, and implemented in three popular programming languages, Java (192 issues), C/C++ (162 issues), and Python (216 issues). From the technical perspective, we summarize eight general types of performance issues with corresponding root causes and resolutions that apply for all three languages. We also identify available tools for detecting and resolving different types of issues from the literature. In addition, we found that 27% of the 570 issues are resolved by design-level optimization—coordinated revision of a group of related source files and their design structure. We reveal four typical design-level optimization patterns, including classic design patterns, change propagation, optimization clone, and parallel optimization that practitioners should be aware of in resolving performance issues. From the engineering perspective, this study analyzes how test code changes in performance optimization. We found that only 15% of the 570 performance issues involve revision of test code. In most cases, the revised test cases focus on the functional logic of the performance optimization, rather than directly evaluate the performance improvement. This finding points to the potential lack of engineering standard for formally verifying performance optimization in regression testing. Finally, from the economic perspective, we analyze the “Return On Investment” of performance optimization. We found that design-level optimization usually requires more investment, but not always yields to higher performance improvement. However, developers tend to use design-level optimization when they concern about other quality attributes, such as maintainability and readability.
engineering, electrical & electronic,computer science, software engineering