Abstract: Harmful and inappropriate online content is prevalent, making it necessary to understand how individuals judge negative content on social media and how they wish to mitigate its spread. In an online study with a diverse sample of social media users (n = 294), we sought to elucidate factors that influence individuals' evaluation of objectionable online content. Participants were presented with images varying in moral valence, each accompanied by an indicator of intention from an ostensible content poster. Half of the participants were assigned the role of user content moderator, while the remaining participants were instructed to respond as they normally would online. The study aimed to establish whether moral imagery, the intention of a content poster, and the perceived responsibility of social media users affect judgments of objectionability, operationalized through both decisions to flag content and preferences to seek punishment of other users. Our findings reveal that moral imagery strongly influences users' assessments of what is appropriate online content, with participants almost exclusively choosing to report and punish morally negative images. Poster intention also plays a significant role in users' decisions, with greater objection shown to morally negative content when another user has shared it to show support for it. Assigning a content moderation role affected reporting behaviour but not punishment preferences. We also explore individual user characteristics, finding a negative association between trust in social media platforms and reporting decisions. Conversely, a positive relationship was identified between trait empathy and reporting rates. Collectively, our insights highlight the complexity of social media users' moderation decisions and preferences.
The results advance understanding of moral judgments and punishment preferences online, and offer insights for platforms and regulatory bodies aiming to better understand social media users' role in content moderation.

Not Judging a User by Their Cover: Understanding Harm in Multi-Modal Processing within Social Media Research

Technological Solutions to Online Toxicity: Potential and Pitfalls

How We Define Harm Impacts Data Annotations: Explaining How Annotators Distinguish Hateful, Offensive, and Toxic Comments

From Perils to Possibilities: Understanding how Human (and AI) Biases affect Online Fora

A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter

Designing Toxic Content Classification for a Diversity of Perspectives

Moral judgment of objectionable online content: Reporting decisions and punishment preferences on social media

But Who Protects the Moderators? The Case of Crowdsourced Image Moderation

Content Moderation Justice and Fairness on Social Media: Comparisons Across Different Contexts and Platforms

Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Collective moderation of hate, toxicity, and extremity in online discussions

Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection

Harmful YouTube Video Detection: A Taxonomy of Online Harm and MLLMs as Alternative Annotators

Human Perception of LLM-generated Text Content in Social Media Environments

Personalizing Content Moderation on Social Media: User Perspectives on Moderation Choices, Interface Design, and Labor

An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates

"It's Not Just Hate": A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Critical Perspectives: A Benchmark Revealing Pitfalls in PerspectiveAPI

Modeling Political Orientation of Social Media Posts: An Extended Analysis

Analyzing Toxicity in Open Source Software Communications Using Psycholinguistics and Moral Foundations Theory