Predicting Which Pull Requests Will Get Reopened in GitHub

Abdillah Mohamed, Li Zhang, Jing Jiang, Ahmed Ktob
DOI: https://doi.org/10.1109/apsec.2018.00052
2018-01-01
Abstract: In GitHub, integrators inspect submitted code changes, make evaluation decisions, and close pull requests. However, some pull requests are later reopened for further modification and code review. It is important to predict which pull requests will be reopened immediately after their first close, so that integrators can reopen them in time. If pull requests are reopened a long time after they are closed, they may conflict with newly submitted pull requests, add software maintenance cost, and increase the burden on already busy developers. To the best of our knowledge, we present the first look at predicting reopened pull requests in GitHub. We propose DTPre, an automatic predictor of reopened pull requests based on a decision tree classifier. DTPre mainly analyzes code features of the modified changes, review features from the evaluation process, and developer features of the contributors. We evaluate the effectiveness of DTPre on 7 open source projects containing 100,622 pull requests. Experimental results show that DTPre performs well, achieving a precision of 95.53%, a recall of 99.01%, and an F1-measure of 97.23% on average. Compared with predictors based on neural network, naïve Bayes, logistic regression, and SVM, DTPre improves the F1-measure by 41.76%, 59.45%, 42.25%, and 9.98%, respectively, on average across the 7 projects.
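For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-measure are conventionally computed for a binary "reopened / not reopened" prediction. It is not the authors' evaluation code; the function names and the confusion counts in the usage lines are hypothetical and chosen only for illustration. Note that the abstract reports per-project averages, so the harmonic mean of the averaged precision and recall only approximately matches the averaged F1-measure.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion counts.

    tp: reopened pull requests correctly predicted as reopened
    fp: pull requests predicted as reopened that were not reopened
    fn: reopened pull requests the predictor missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from(precision, recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1-measure: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical counts, not taken from the paper.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=5)

# Harmonic mean of the averaged precision and recall from the abstract.
# This is close to, but not necessarily identical to, the reported
# average F1-measure, which is averaged per project.
f1_of_averages = f1_from(0.9553, 0.9901)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, which is why a predictor needs both high precision and high recall to score well.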