Prioritizing code review requests to improve review efficiency: a simulation study

Jia, Junyu; Xue, Junming
DOI: https://doi.org/10.1007/s10664-024-10575-0
IF: 3.762
2024-11-13
Empirical Software Engineering
Abstract: Code review has become a common quality assurance process in modern software development. For large-scale, active software projects emphasizing continuous delivery and fast feedback, one of the main challenges with code review is prioritizing the many Code Review Requests (CRRs) these projects receive. Many heuristic rules and machine learning models have been adopted to develop CRR prioritizers. However, these prioritizers were evaluated before the code reviews started. Such a pre-review evaluation gives little indication of how prioritizing CRRs influences code review efficiency. In this paper, we conduct a simulation study that aims (1) to perform a post-review evaluation of CRR prioritizers, and (2) to evaluate how the position from which a CRR is picked affects review efficiency, as developers have different review preferences. To achieve the first goal, we use discrete-event simulation-based software process simulation modeling methods to simulate prioritization-centric code review processes, and we propose new evaluation metrics from the perspectives of completion and delivery to measure code review efficiency. We develop nine prioritizers and use historical review orders as baselines for the post-review evaluation. The experiments using 15 GitHub projects show that (1) prioritizing CRRs helps complete open CRRs and helps deliver closed CRRs, and (2) Random Forest is the best-performing prioritizer. To achieve the second goal, we devise nine positions from which a CRR may be picked from the ordered list each time, and we use Random Forest as the base prioritizer for the post-review evaluation. The results show that picking a CRR from the top-1 position is better than picking from other positions.
Overall, through this simulation study, we provide software organizations with (1) a new evaluation method (post-review evaluation) and new metrics (completion and delivery) for CRR prioritizers, and (2) the best-performing prioritizer (Random Forest) and pick position (top-1) for improving code review efficiency.
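
To make the idea of a post-review evaluation concrete, the following is a minimal sketch of a discrete-event loop in which a single reviewer repeatedly picks a CRR from a fixed position in a score-ordered list. It is not the authors' simulation model: the `CRR` fields, the single-reviewer assumption, the `simulate` and `mean_turnaround` helpers, and the turnaround metric are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CRR:
    """A hypothetical code review request (all fields are illustrative)."""
    name: str
    arrival: float   # time the request is opened
    duration: float  # time a reviewer needs to finish it
    score: float     # prioritizer output; higher means review sooner


def simulate(crrs, pick_position=0):
    """Single-reviewer discrete-event loop (an assumed, simplified process).

    Whenever the reviewer is free, the open CRRs are sorted by score in
    descending order and the one at `pick_position` is reviewed; if the list
    is shorter than that, the last entry is taken.
    Returns a dict mapping CRR name to completion time.
    """
    pending = sorted(crrs, key=lambda c: c.arrival)  # not yet arrived
    open_list = []
    clock = 0.0
    done = {}
    while pending or open_list:
        # If nothing is open, jump the clock to the next arrival event.
        if not open_list:
            clock = max(clock, pending[0].arrival)
        # Move every CRR that has arrived by now into the open list.
        while pending and pending[0].arrival <= clock:
            open_list.append(pending.pop(0))
        open_list.sort(key=lambda c: c.score, reverse=True)
        index = min(pick_position, len(open_list) - 1)
        current = open_list.pop(index)
        clock += current.duration  # review-completion event
        done[current.name] = clock
    return done


def mean_turnaround(crrs, done):
    """Average time from opening a CRR to completing its review."""
    return sum(done[c.name] - c.arrival for c in crrs) / len(crrs)


# Toy data: three requests opened at time 0 with made-up scores.
requests = [
    CRR("a", arrival=0, duration=4, score=0.2),
    CRR("b", arrival=0, duration=1, score=0.9),
    CRR("c", arrival=0, duration=2, score=0.5),
]
top1 = simulate(requests, pick_position=0)
top2 = simulate(requests, pick_position=1)
print(mean_turnaround(requests, top1), mean_turnaround(requests, top2))
```

In this toy input the top-1 policy finishes the requests in the order b, c, a, while picking from the second position finishes them in the order c, a, b. The comparison only illustrates the mechanics of evaluating a pick position after the simulated reviews complete; it says nothing about the 15 GitHub projects studied in the paper.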
computer science, software engineering