Abstract: Studies have shown that toxic behavior can cause contributors to leave and can hinder the participation of newcomers (especially those from underrepresented communities) in Open Source Software (OSS) projects. Detecting toxic language therefore plays a crucial role in OSS collaboration and inclusivity. Off-the-shelf toxicity detectors are ineffective when applied to OSS communications because of the distinct nature of toxicity observed in these channels (e.g., entitlement and arrogance are more frequently observed on GitHub than on Reddit or Twitter). In this paper, we investigate a machine learning-based approach for the automatic detection of toxic communications in OSS. We leverage psycholinguistic lexicons and Moral Foundations Theory to analyze toxicity in two types of OSS communication channels: issue comments and code reviews. Our evaluation indicates that our approach can achieve a significant performance improvement (up to a 7% increase in F1 score) over the existing domain-specific toxicity detector. We found that using moral values as features is more effective than using linguistic cues, resulting in a 67.50% F1-measure in identifying toxic instances in code review data and 64.83% in issue comments. While detection performance is still far from perfect, this improvement demonstrates the potential of integrating moral and psycholinguistic features in toxicity detection models. These findings highlight the importance of context-specific models that consider the unique communication styles within OSS, where interpersonal and value-driven language dynamics differ markedly from those on general social media platforms. Future work could focus on refining these models to further enhance detection accuracy, possibly by incorporating community-specific norms and conversational context to better capture the nuanced expressions of toxicity in OSS environments.

What problem does this paper attempt to address?

This paper addresses the problem of automatically detecting toxic communication in open-source software (OSS) projects. Specifically, the authors focus on the fact that existing general-purpose toxicity detection tools perform poorly on OSS communication, because toxicity is expressed differently there. For example, insults, arrogance, and entitlement arising from technical disagreements are more common. As a result, general-purpose tools fail to identify "covert toxicity" in OSS.

### Core Problems of the Paper

1. **Limitations of Existing Tools**: Existing general-purpose toxicity detection tools (such as Google's Perspective API) perform poorly when applied to OSS communication because they fail to capture the language styles and norms specific to OSS.
2. **Toxicity Characteristics in OSS**: Toxicity in OSS is not limited to plainly offensive language. It also includes subtler expressions within technical discussions, such as sarcasm and entitlement, which are less common on other platforms (such as Reddit or Twitter).
3. **Improving Detection Accuracy**: The authors aim to improve toxicity detection for OSS communication by drawing on psycholinguistics and moral theory, making the model more accurate and more applicable to this setting.

### Solutions

To address these problems, the authors propose a machine-learning-based method that uses psycholinguistic lexicons and Moral Foundations Theory (MFT) to analyze toxicity in OSS. Specifically:

- **Psycholinguistic Features**: Use the Linguistic Inquiry and Word Count (LIWC) dictionary to extract psycholinguistic features of the text, such as "Clout", "Authentic", and "Tone".
- **Moral Features**: Following MFT, use the moral values expressed in the text (care/harm, fairness/cheating, authority/subversion, loyalty/betrayal, sanctity/degradation) as features.

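
The lexicon-based feature extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the word lists below are a hypothetical toy lexicon invented for illustration (a real setup would load the licensed LIWC dictionary and a Moral Foundations dictionary), and the function name `moral_features` is likewise an assumption. The sketch only shows the general idea of turning a comment into one score per moral foundation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, for illustration only. It is NOT the LIWC or
# Moral Foundations dictionary used in the paper.
MORAL_LEXICON = {
    "care_harm": {"help", "protect", "hurt", "harm"},
    "fairness_cheating": {"fair", "unfair", "cheat", "deserve"},
    "loyalty_betrayal": {"team", "community", "betray", "abandon"},
    "authority_subversion": {"obey", "respect", "refuse", "demand"},
    "sanctity_degradation": {"clean", "garbage", "disgusting", "trash"},
}

TOKEN_RE = re.compile(r"[a-z']+")


def moral_features(text):
    """Return, per lexicon category, the share of tokens found in that category."""
    tokens = TOKEN_RE.findall(text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    counts = Counter()
    for token in tokens:
        for category, words in MORAL_LEXICON.items():
            if token in words:
                counts[category] += 1
    return {category: counts[category] / total for category in MORAL_LEXICON}


if __name__ == "__main__":
    comment = "This patch is garbage, I demand a fix"
    for category, score in moral_features(comment).items():
        print(f"{category}: {score:.3f}")
```

Each comment then becomes a fixed-length numeric vector that could be passed to a standard classifier alongside LIWC-style scores. Which classifier to use is left open here, since this summary does not name one.
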
In this way, the authors aim to develop a toxicity detection model better suited to the OSS environment, and thereby to better support the inclusiveness and sustainability of OSS projects.

### Experimental Results

The experiments show that adding psycholinguistic and moral features improves model performance over the existing domain-specific toxicity detector, with the F1 score rising by up to 7%. This suggests that accounting for the language and moral context specific to OSS matters for accurate toxicity detection.

### Future Work

Future research can refine these models further, for example by incorporating community-specific norms and conversational context to capture the subtle forms that toxicity takes in OSS environments.

Analyzing Toxicity in Open Source Software Communications Using Psycholinguistics and Moral Foundations Theory
