MSR4SM: Using Topic Models to Effectively Mining Software Repositories for Software Maintenance Tasks

Xiaobing Sun, Bixin Li, Hareton Leung, Bin Li, Yun Li
DOI: https://doi.org/10.1016/j.infsof.2015.05.003
IF: 3.9
2015-01-01
Information and Software Technology
Abstract:
Context: Mining software repositories has emerged as a research direction over the past decade, achieving substantial success in both research and practice to support various software maintenance tasks. Software repositories include bug repositories, communication archives, source control repositories, etc. When these repositories are used to support software maintenance, the inclusion of irrelevant information from each repository can lead to decreased effectiveness or even wrong results.
Objective: This article aims at selecting the relevant information from each of the repositories to improve the effectiveness of software maintenance tasks.
Method: For a maintenance task at hand, maintainers need to implement the maintenance request on the current system. In this article, we propose an approach, MSR4SM, to extract the relevant information from each software repository based on the maintenance request and the current system. That is, if the information in a software repository is relevant to either the maintenance request or the current system, this information should be included to perform the current maintenance task. MSR4SM uses a topic model to extract the topics from these software repositories. Then, the relevant information in each software repository is extracted based on the topics.
Results: MSR4SM is evaluated on two software maintenance tasks, feature location and change impact analysis, using four subject systems, namely jEdit, ArgoUML, Rhino and KOffice. The empirical results show that the effectiveness of traditional software-repository-based maintenance tasks can be greatly improved by MSR4SM.
Conclusions: There is a lot of irrelevant information in software repositories. Before we use them to implement a maintenance task at hand, we need to preprocess them. Then, the effectiveness of the software maintenance tasks can be improved.
(C) 2015 Elsevier B.V. All rights reserved.
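The selection rule described in the Method (keep repository information that is relevant to either the maintenance request or the current system) can be illustrated with a minimal sketch. It assumes that a topic model has already assigned each item a topic distribution; the cosine similarity measure, the 0.5 threshold, the function names, and the sample data are all hypothetical choices made for illustration and are not taken from the paper.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def select_relevant(items, request_topics, system_topics, threshold=0.5):
    """Keep items whose topics are close to the request OR the current system.

    The threshold value is an assumption for this sketch, not a value
    reported by the paper.
    """
    return [
        name
        for name, topics in items.items()
        if cosine(topics, request_topics) >= threshold
        or cosine(topics, system_topics) >= threshold
    ]


# Hypothetical topic distributions over three topics.
request_topics = [0.8, 0.1, 0.1]   # maintenance request
system_topics = [0.1, 0.8, 0.1]    # current system
repository_items = {
    "bug-101": [0.7, 0.2, 0.1],    # close to the request
    "mail-17": [0.1, 0.1, 0.8],    # close to neither
    "commit-a1": [0.2, 0.7, 0.1],  # close to the current system
}

relevant = select_relevant(repository_items, request_topics, system_topics)
print(relevant)  # ['bug-101', 'commit-a1']
```

In this toy setting, the item whose topics match neither the request nor the system is filtered out before any downstream maintenance task would use the repository data.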