Sentiment Analysis of ML Projects: Bridging Emotional Intelligence and Code Quality

Md Shoaib Ahmed, Dongyoung Park, Nasir U. Eisty
2024-09-26
Abstract: This study explores the intricate relationship between sentiment analysis (SA) and code quality within machine learning (ML) projects, illustrating how the emotional dynamics of developers affect the technical and functional attributes of software projects. Recognizing the vital role of developer sentiments, this research employs advanced sentiment analysis techniques to scrutinize affective states from textual interactions such as code comments, commit messages, and issue discussions within high-profile ML projects. By integrating a comprehensive dataset of popular ML repositories, this analysis applies a blend of rule-based, machine learning, and hybrid sentiment analysis methodologies to systematically quantify sentiment scores. The emotional valence expressed by developers is then correlated with a spectrum of code quality indicators, including the prevalence of bugs, vulnerabilities, security hotspots, code smells, and duplication instances. Findings from this study distinctly illustrate that positive sentiments among developers are strongly associated with superior code quality metrics manifested through reduced bugs and lower incidence of code smells. This relationship underscores the importance of fostering positive emotional environments to enhance productivity and code craftsmanship. Conversely, the analysis reveals that negative sentiments correlate with an uptick in code issues, particularly increased duplication and heightened security risks, pointing to the detrimental effects of adverse emotional conditions on project health.
Software Engineering
What problem does this paper attempt to address?
This paper examines the relationship between developer sentiment, as measured by sentiment analysis (SA), and code quality in machine learning (ML) projects. Specifically, it analyzes the emotional states expressed in developers' textual interactions (code comments, commit messages, and issue discussions) and evaluates how those emotions relate to the quality and functional properties of the software. The core questions of the paper are:

1. **How do developers' emotions affect the code quality of machine-learning projects?** The research quantifies developers' sentiment scores and correlates them with code quality indicators: bugs, vulnerabilities, security hotspots, code smells, and duplicated code.
2. **What is the association between positive emotions and high-quality code?** The research finds that positive sentiment among developers is strongly associated with better code quality metrics, manifested as fewer bugs and a lower incidence of code smells.
3. **What is the impact of negative emotions on project health?** The research finds that negative sentiment correlates with an increase in code issues, especially duplicated code and security risks, which points to a detrimental effect of adverse emotional conditions on project health.

By answering these questions, the paper aims to provide a basis for fostering a positive emotional environment for developers, and thereby improving productivity and code quality. It also emphasizes the importance of taking emotions and moods into account in machine-learning projects in order to improve teamwork and communication.
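The score-then-correlate idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the summary does not name the tools they used, so the tiny word lexicon, the function names, the `messages` and `bugs` field names, and the choice of Pearson correlation are all assumptions made for illustration. The sketch covers only a toy rule-based scorer, not the machine learning or hybrid methods the abstract mentions.

```python
import math
import re

# Hypothetical toy lexicon; a real rule-based analyzer uses a far larger one.
POSITIVE = {"great", "thanks", "clean", "nice", "good"}
NEGATIVE = {"ugly", "broken", "hack", "terrible", "bad"}


def sentiment_score(text):
    """Score one message in [-1, 1]: (positive - negative) / matched words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


def project_sentiment(messages):
    """Average the per-message scores for one project (0.0 if no messages)."""
    scores = [sentiment_score(m) for m in messages]
    return sum(scores) / len(scores) if scores else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def sentiment_bug_correlation(projects):
    """Correlate mean project sentiment with each project's bug count.

    `projects` is a list of dicts with hypothetical keys:
    "messages" (list of str) and "bugs" (int).
    """
    sentiments = [project_sentiment(p["messages"]) for p in projects]
    bugs = [p["bugs"] for p in projects]
    return pearson(sentiments, bugs)
```

Calling `sentiment_bug_correlation` on a list of projects returns a value between -1 and 1. A negative value would be consistent with the reported pattern that more positive sentiment accompanies fewer bugs. The same function could be pointed at any other quality indicator, such as code smells or duplication counts, by swapping the `bugs` field.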