Analyzing the Correlation Between Toxic Comments and Code Quality

Jaime Sayago‐Heredia, Gustavo Chango Sailema, Ricardo Pérez‐Castillo, Mario Piattini
DOI: https://doi.org/10.1002/smr.2739
2024-11-17
Journal of Software: Evolution and Process
Abstract: Software development has a significant human side, which suggests that developers' feelings may affect aspects of software development such as quality, productivity, or performance. This paper explores the effects of toxic emotions on code quality and presents the SentiQ tool, which gathers sentiments from commit messages (obtained from GitHub) and code quality measures (obtained from SonarQube). SentiQ performs a sentiment analysis based on natural language processing techniques and relates the results to the code quality measures. The extracted datasets form the basis of a preliminary case study, which indicates a relationship between toxic comments and code quality that may affect the quality of the whole software project. This has led to a draft predictive model that estimates code quality from toxic comments and serves to validate the observed correlation. The main implication of this work is that these results could, in the future, make it possible to estimate code quality as a function of developers' toxic comments.
computer science, software engineering
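
The kind of analysis the abstract describes, scoring commit messages for toxicity and relating the scores to a code quality measure, can be illustrated with a minimal sketch. This is not SentiQ's implementation: the word list, the `toxicity_score` and `pearson` helpers, and the sample commits with their code-smell counts are all hypothetical and made up for illustration, and a simple lexicon lookup stands in for the paper's NLP-based sentiment analysis.

```python
import math

# Hypothetical toy lexicon; a stand-in for a real NLP sentiment model.
TOXIC_WORDS = {"stupid", "garbage", "hate", "idiot", "crap"}


def toxicity_score(message: str) -> float:
    """Return the fraction of words in a message found in the toy lexicon."""
    words = [w.strip(".,!?;:").lower() for w in message.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in TOXIC_WORDS for w in words) / len(words)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Made-up records: (commit message, code-smell count for that revision).
# In the paper's setting these would come from GitHub and SonarQube.
commits = [
    ("Fix null check in parser", 3),
    ("Remove this stupid garbage hack", 12),
    ("Add unit tests for tokenizer", 2),
    ("I hate this crap module", 9),
]

scores = [toxicity_score(message) for message, _ in commits]
smells = [float(count) for _, count in commits]
r = pearson(scores, smells)
print(f"Pearson r between toxicity and code smells: {r:.3f}")
```

With real data, a coefficient computed this way would only indicate association, not causation, and four hand-picked records say nothing about actual projects.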