A preliminary investigation on using multi-task learning to predict change performance in code reviews

Lanxin Yang, He Zhang, Jinwei Xu, Jun Lyu, Xin Zhou, Dong Shao, Shan Gao, Alberto Bacchelli
DOI: https://doi.org/10.1007/s10664-024-10526-9
IF: 3.762
2024-09-29
Empirical Software Engineering
Abstract: The various performances of a change in code reviews have received growing concerns from software organizations and researchers. Researchers have investigated these aspects in isolation from one another (e.g., predicting the merge approval of a change after review), but this approach provides limited value for decision-making (e.g., decomposing composite changes). Although developing multiple task-specific models to address this problem is possible, training and deploying multiple models can be time- and cost-consuming. In this paper, we propose a multi-task learning (MTL) approach that leverages a single model to predict a set of performances simultaneously, including iteration, duration, score, and merge approval. Considering the absence of MTL models for this problem in code reviews, we draw inspiration from the domain of recommender systems, which exhibits active research on feature engineering and model training for MTL. By adapting and refining the structure of MTL models employed in recommender systems, we aim to identify the most suitable model to form our approach. Our approach incorporates four groups of features (project experience, author experience, reviewer experience, and code change) to represent a change and deploys eight MTL models including Shared-Bottom, OMoE, MMoE, ESMM, SNR, CGC, PLE, and AITM. We evaluate our approach using data from four large open-source projects (Eclipse, LibreOffice, OpenDaylight, and OpenStack), comprising more than 661 thousand changes within the Gerrit community. The experimental results demonstrate that (1) ESMM exhibits the best performance, (2) ESMM outperforms single-task learning models, (3) ESMM could be influenced by code change-related features in most cases, and (4) ESMM could be used for cross-project change prediction but its performance decreases slightly.
Overall, as the first attempt to apply multi-task learning to predict change performance in code reviews, our approach has shown promising results but still requires improvement.
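Illustrative sketch (not from the paper): the idea of one model predicting several change performances at once can be shown with a Shared-Bottom layout, the simplest of the eight MTL structures named in the abstract. The NumPy code below is a forward pass only, with untrained random weights. The hidden size, the weight initialization, the feature count, and the choice to treat each of the four targets as a binary label are all assumptions made for illustration; the abstract does not specify how the authors encode the targets or configure their models.

```python
import numpy as np

# The four performances named in the abstract; treating each one as a
# binary target is an assumption of this sketch, not a detail from the paper.
TASKS = ("iteration", "duration", "score", "merge_approval")


def init_params(n_features, hidden=16, seed=0):
    """Create random weights for one shared layer and one head per task."""
    rng = np.random.default_rng(seed)
    params = {
        "W_shared": rng.normal(0.0, 0.1, (n_features, hidden)),
        "b_shared": np.zeros(hidden),
    }
    for task in TASKS:
        params[f"W_{task}"] = rng.normal(0.0, 0.1, (hidden, 1))
        params[f"b_{task}"] = np.zeros(1)
    return params


def predict_proba(params, X):
    """Return one probability vector per task for a batch of changes X."""
    # Shared bottom: a single ReLU layer whose output feeds every task head.
    h = np.maximum(0.0, X @ params["W_shared"] + params["b_shared"])
    out = {}
    for task in TASKS:
        # Task-specific head: a linear layer followed by a sigmoid.
        logits = (h @ params[f"W_{task}"] + params[f"b_{task}"]).ravel()
        out[task] = 1.0 / (1.0 + np.exp(-logits))
    return out


if __name__ == "__main__":
    # Hypothetical batch: 5 changes, each described by 8 numeric features.
    X = np.random.default_rng(1).normal(size=(5, 8))
    for task, p in predict_proba(init_params(8), X).items():
        print(task, p.round(3))
```

Because all four heads read the same hidden representation, a single training run and a single deployed model would serve every task, which is the cost argument the abstract makes against maintaining separate task-specific models.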
computer science, software engineering