Just-In-Time Obsolete Comment Detection and Update.
Zhongxin Liu,Xin Xia,David Lo,Meng Yan,Shanping Li
DOI: https://doi.org/10.1109/tse.2021.3138909
IF: 7.4
2021-01-01
IEEE Transactions on Software Engineering
Abstract: Comments are valuable resources for the development, comprehension, and maintenance of software. However, when changing code, developers sometimes neglect to update the corresponding comments, which leaves those comments obsolete. Obsolete comments can mislead developers and introduce bugs in the future, and are therefore detrimental. We observe that obsolete comments can be effectively reduced, and even avoided, by detecting and updating them at the time the code changes. We refer to this task as Just-In-Time (JIT) Obsolete Comment Detection and Update. In this work, we propose a two-stage framework named CUP$^{2}$ (Two-stage Comment UPdater) to automate this task. CUP$^{2}$ consists of two components, an Obsolete Comment Detector named OCD and a Comment UPdater named CUP, which rely on distinct neural network models to perform detection and update, respectively. Specifically, given a code change and a corresponding comment, CUP$^{2}$ first uses OCD to predict whether the comment should be updated. If so, CUP is used to generate the new version of the comment automatically. To evaluate CUP$^{2}$, we build a large-scale dataset with over 4 million code-comment change samples. Our dataset focuses on method-level code changes and updates to method header comments, considering the importance and widespread use of such comments. Evaluation results show that 1) both OCD and CUP outperform their baselines by significant margins, and 2) CUP$^{2}$ performs better than a rule-based baseline. Specifically, the comments generated by CUP$^{2}$ are identical to the ground truth for 41.8% of the samples that OCD predicts to be positive. We believe CUP$^{2}$ can help developers detect obsolete comments, better understand where and how to update them, and reduce the edits needed to update them.
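The two-stage control flow described in the abstract (detect first, update only on a positive prediction) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Sample`, `cup2`, `toy_detect`, and `toy_update` are hypothetical, and the two toy functions are simple token heuristics standing in for the neural models OCD and CUP.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sample:
    """One code-comment change sample (hypothetical structure)."""
    old_code: str
    new_code: str
    old_comment: str


def cup2(sample: Sample,
         detect: Callable[[Sample], bool],
         update: Callable[[Sample], str]) -> Optional[str]:
    """Stage 1: decide whether the comment is obsolete.
    Stage 2: generate a new comment only if stage 1 says yes."""
    if not detect(sample):
        return None  # comment is judged up to date; nothing to generate
    return update(sample)


def _tokens(text: str) -> list:
    return re.findall(r"\w+", text)


def toy_detect(s: Sample) -> bool:
    """Assumption: flag the comment if it mentions a token removed from the code."""
    removed = set(_tokens(s.old_code)) - set(_tokens(s.new_code))
    return any(tok in removed for tok in _tokens(s.old_comment))


def toy_update(s: Sample) -> str:
    """Assumption: apply position-wise token renames from the code to the comment."""
    old, new = _tokens(s.old_code), _tokens(s.new_code)
    renames = {o: n for o, n in zip(old, new) if o != n} if len(old) == len(new) else {}
    return re.sub(r"\w+", lambda m: renames.get(m.group(0), m.group(0)), s.old_comment)


changed = Sample(
    old_code="int getWidth() { return width; }",
    new_code="int getHeight() { return height; }",
    old_comment="Returns the width.",
)
unchanged = Sample(
    old_code="int getWidth() { return width; }",
    new_code="int getWidth() { return width; }",
    old_comment="Returns the width.",
)

print(cup2(changed, toy_detect, toy_update))
print(cup2(unchanged, toy_detect, toy_update))
```

Passing the detector and updater as callables keeps the pipeline independent of how each stage is modeled, so a learned model could replace either toy function without changing `cup2`.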