Identifying Composite Crosscutting Concerns Through Semi-Supervised Learning
Jianlin Zhu, Jin Huang, Daicui Zhou, Federico Carminati, Guoping Zhang, Qiang He
DOI: https://doi.org/10.1002/spe.2234
2013-01-01
Abstract: Aspect mining improves the modularity of legacy software systems by identifying their underlying crosscutting concerns (CCs). However, a realistic CC is a composite one that consists of CC seeds and related program elements, which makes identifying a composite CC a great challenge. In this paper, inspired by state-of-the-art information retrieval techniques, we model this problem as a semi-supervised learning problem. First, a link analysis technique is adopted to generate CC seeds. Second, we construct a coupling graph, which indicates the relationships between CC seeds. Then, we adopt a community detection technique to generate groups of CC seeds as constraints for semi-supervised learning, which guide the clustering process. Furthermore, we propose a semi-supervised graph clustering approach named constrained authority-shift clustering to identify composite CCs. Two measurements, namely, similarity and connectivity, are defined, and a seeded graph is generated for clustering program elements. We evaluate constrained authority-shift clustering on numerous software systems, including a large-scale distributed software system. The experimental results demonstrate that our semi-supervised learning approach is more effective in detecting composite CCs. Copyright (c) 2013 John Wiley & Sons, Ltd.
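
The first three steps of the pipeline described in the abstract (seed generation, coupling graph, seed grouping) can be illustrated with a minimal sketch. The abstract does not name the link analysis algorithm, the coupling criterion, or the community detection method, so everything here is an assumption: PageRank stands in for link analysis, "called by a common element" stands in for coupling, and connected components stand in for community detection. The call graph and all method names are hypothetical. This is not the authors' constrained authority-shift clustering, which is not specified in enough detail here to reproduce.

```python
from collections import defaultdict

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {caller: [callees]} (assumed link analysis)."""
    nodes = sorted(set(graph) | {v for succ in graph.values() for v in succ})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            succ = graph.get(u, [])
            # A node without callees spreads its rank uniformly over all nodes.
            targets = succ if succ else nodes
            share = damping * rank[u] / len(targets)
            for v in targets:
                new[v] += share
        rank = new
    return rank

def top_seeds(rank, k):
    """Pick the k highest-ranked elements as candidate CC seeds."""
    return sorted(rank, key=lambda u: (-rank[u], u))[:k]

def coupling_graph(graph, seeds):
    """Link two seeds when some element calls both (assumed coupling criterion)."""
    callers = defaultdict(set)
    for u, succ in graph.items():
        for v in succ:
            if v in seeds:
                callers[v].add(u)
    edges = {s: set() for s in seeds}
    ordered = sorted(seeds)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if callers[a] & callers[b]:
                edges[a].add(b)
                edges[b].add(a)
    return edges

def seed_groups(edges):
    """Connected components, a simple stand-in for community detection."""
    seen, groups = set(), []
    for start in sorted(edges):
        if start in seen:
            continue
        stack, component = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.append(node)
            stack.extend(edges[node] - seen)
        groups.append(sorted(component))
    return groups

# Hypothetical call graph: logging and tracing are always invoked together,
# while locking is invoked from a different set of methods.
call_graph = {
    "Order.save": ["Logger.log", "Tracer.trace"],
    "Order.load": ["Logger.log", "Tracer.trace"],
    "User.save": ["Lock.acquire", "Db.write"],
    "User.load": ["Lock.acquire", "Db.read"],
}

rank = pagerank(call_graph)
seeds = top_seeds(rank, k=3)
groups = seed_groups(coupling_graph(call_graph, seeds))
```

On this toy graph the three seeds are `Logger.log`, `Tracer.trace`, and `Lock.acquire`, and `groups` places the first two together and `Lock.acquire` on its own. In the approach the abstract describes, such groups would then serve as constraints that guide the clustering of the remaining program elements.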