How Developers Modify Pull Requests in Code Review

Jing Jiang, Jiangfeng Lv, Jiateng Zheng, Li Zhang
DOI: https://doi.org/10.1109/tr.2021.3093159
IF: 5.883
2022-01-01
IEEE Transactions on Reliability
Abstract: In the pull-based development process, contributors submit their code to open-source projects through pull requests, which reviewers accept or reject. Contributors may modify their code, which causes several iterations of the code review process and makes code reviews time-consuming for both contributors and reviewers. In this article, we set out to study pull request modifications in the code review process. We collect nine projects on GitHub with 104 307 pull requests, and investigate pull request modifications by analyzing the commits added after pull requests’ submission. By studying four research questions, we summarize our major findings as follows. First, 34.56% of the collected pull requests have modifications. Pull requests with modifications have longer lifetimes but higher pass rates. Second, we identify eight modification types indicating why pull requests are modified. Third, we propose a novel method called MClassify to automatically classify pull request modifications, which achieves an accuracy of 0.807. Fourth, the various modification types affect code review differently in terms of lifetime and pass rate. Pull requests with source control system management modifications have the longest lifetime. These findings enable developers and researchers to better understand the pull-based code review process and to make improvements.