Untangling Composite Commits by Attributed Graph Clustering

Siyu Chen, Shengbin Xu, Yuan Yao, Feng Xu
DOI: https://doi.org/10.1145/3545258.3545267
2022-01-01
Abstract: During software development, it is considered best practice for each commit to represent one distinct concern, such as fixing a bug or adding a new feature. However, developers do not always follow this practice and sometimes tangle multiple concerns into a single composite commit. This makes automatic commit untangling a necessary task, and recent approaches mainly untangle commits by applying graph clustering to the code dependency graph. In this paper, we propose a new commit untangling approach, ComUnt, to decompose composite commits into atomic ones. Unlike existing approaches, ComUnt is built upon the observation that both the textual content of code statements and the dependencies between code statements carry semantic information that is useful for comprehending the committed code changes. Based on this observation, ComUnt first constructs an attributed graph for each commit, where code statements and various code dependencies are modeled as nodes and edges, respectively, and the textual content of each code statement is kept as a node attribute. It then conducts attributed graph clustering on the constructed graph. The attributed graph clustering algorithm it uses can simultaneously encode both graph structure and node attributes, so as to better separate the code changes into clusters with distinct concerns. We evaluate our approach on nine C# projects, and the experimental results show that ComUnt improves on the state of the art by 7.8% in terms of untangling accuracy, while being more than 6 times faster.
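The two-step idea in the abstract (build an attributed graph, then cluster it using both structure and node text) can be illustrated with a small sketch. This is not ComUnt's algorithm: the abstract does not specify the clustering method, so the sketch substitutes a deliberately simple stand-in that mixes a dependency-edge indicator with Jaccard similarity over statement tokens and merges statements with union-find. The function names, the `alpha` weight, the `threshold` value, and the sample statements are all hypothetical.

```python
import re
from itertools import combinations


def tokens(text):
    """Node attribute: the set of lowercase identifier-like tokens in a statement."""
    return {t.lower() for t in re.findall(r"[A-Za-z_]\w*", text)}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def untangle(statements, dependencies, alpha=0.5, threshold=0.2):
    """Group changed statements into clusters (stand-in, NOT the paper's method).

    statements:   dict mapping a statement id to its source text (the nodes).
    dependencies: iterable of (id, id) pairs (the edges).
    alpha:        hypothetical weight of structure versus text.
    threshold:    hypothetical minimum affinity for merging two statements.
    """
    attrs = {sid: tokens(text) for sid, text in statements.items()}
    edges = {frozenset(pair) for pair in dependencies}
    parent = {sid: sid for sid in statements}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(statements, 2):
        structural = 1.0 if frozenset((u, v)) in edges else 0.0
        textual = jaccard(attrs[u], attrs[v])
        # Affinity uses both the graph structure and the node attributes.
        affinity = alpha * structural + (1 - alpha) * textual
        if affinity >= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for sid in statements:
        clusters.setdefault(find(sid), []).append(sid)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical composite commit: a retry change tangled with a logging change.
statements = {
    "s1": "int retries = config.MaxRetries;",
    "s2": "client.Retry(retries);",
    "s3": "logger.Info(message);",
    "s4": "logger.Warn(message);",
}
dependencies = [("s1", "s2")]  # s2 uses the variable defined in s1

print(untangle(statements, dependencies))
```

In this toy input, s1 and s2 are grouped because of the dependency edge, while s3 and s4 share no edge and are grouped only through their overlapping text, which is the kind of signal a purely structural clustering would miss.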