Effect of Toxic Review Content on Overall Product Sentiment

Mayukh Mukhopadhyay, Sangeeta Sahney
DOI: https://doi.org/10.48550/arXiv.2201.02857
2022-01-09
Abstract: Toxic content in online product reviews is a common phenomenon. Content is perceived to be toxic when it is rude, disrespectful, or unreasonable and makes individuals leave the discussion. Machine learning algorithms help the sell-side community to identify such toxic patterns and eventually moderate such inputs. Yet, the extant literature provides less information about the sentiment of a prospective consumer on the perception of a product after being exposed to such toxic review content. In this study, we collect a balanced data set of review comments from 18 different players segregated into three different sectors from the Google Play Store. Then we calculate the sentence-level sentiment and toxicity score of individual review content. Finally, we use structural equation modelling to quantitatively study the influence of toxic content on overall product sentiment. We observe that comment toxicity negatively influences overall product sentiment but does not exhibit a mediating effect over reviewer score to influence sector-wise relative rating.
Human-Computer Interaction, Computation and Language, General Economics, Applications
What problem does this paper attempt to address?
The main problem this paper addresses is **the impact of toxic comment content on overall product sentiment**. Specifically, by collecting and analyzing comment data from 18 different applications (divided into three industry sectors) in the Google Play store, the authors study how toxic content in the comments affects consumers' overall sentiment towards the product.

### Research Background and Problems

1. **Definition and Identification of Toxic Comments**:
   - Toxic comments are rude, disrespectful, or unreasonable content that may cause users to leave the discussion.
   - Machine-learning algorithms can help identify and moderate these toxic comments, but the existing literature pays less attention to how consumers' sentiment towards a product changes after exposure to toxic comments.
2. **Research Questions**:
   - **RQ1**: Does the reviewer's rating affect the sentiment and toxicity of the comment content?
   - **RQ2**: Given the sentiment of the comment, does the comment toxicity affect the overall product sentiment?
   - **RQ3**: Does the comment toxicity play a significant mediating role between the reviewer's rating and the sector-wise relative rating?
   - **RQ4**: Is the overall product sentiment affected by differences between the industries studied?

### Method Overview

To answer these questions, the authors adopted the following methods:

1. **Data Extraction**:
   - A web-crawling tool written in Python was used to collect comment data from 18 applications in three different industries (subscription services, medicine and health, tourism) from the Google Play store.
   - 1,200 comments were collected for each application to keep the data set balanced.
2. **Data Enrichment**:
   - The Perspective API provided by the Google Jigsaw team was used to calculate the toxicity score of each comment, with a range of 0 to 1.
   - The sentimentr package in R was used to calculate the sentence-level sentiment score of each comment.
3. **Model Evaluation**:
   - Structural Equation Modeling (SEM) was used to quantitatively study the impact of toxic comments on the overall product sentiment.
   - Path analysis, a measurement model, and a total effect model were used to analyze the relationships between variables.

### Conclusions

The study found that comment toxicity has a negative impact on the overall product sentiment, but it does not show a mediating effect between the reviewer's rating and the sector-wise relative rating. In addition, the overall product sentiment differs across industries.

### Formula Representation

- **Toxicity Score**: $T_i \in [0, 1]$, where $i$ denotes the $i$-th comment.
- **Sentiment Score**: $S_i \in [-1, 1]$, where $i$ denotes the $i$-th comment.
- **Sector-wise Relative Rating (SRR)**:
  \[ \text{SRR} = R_p \times R_c \]
  - $R_p$ represents the overall rating of an application.
  - $R_c$ represents the reviewer's credit value ($R_c = 2$ for applications with more than one million reviews; otherwise $R_c = 1$).

Together, these methods and analyses quantify how toxic comments affect consumer sentiment and point to directions for future research.
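
The SRR formula above can be illustrated with a short Python sketch. This is not the authors' code: the function names, the example application names, and the example ratings and review counts are all hypothetical, chosen only to show how the credit value $R_c$ switches at the one-million-review threshold stated in the summary.

```python
# Minimal sketch of SRR = R_p * R_c as described in the summary.
# All names and sample values below are illustrative assumptions,
# not data or code from the paper.

REVIEW_THRESHOLD = 1_000_000  # "more than one million reviews"


def reviewer_credit(review_count: int) -> int:
    """Return R_c: 2 if the app has more than one million reviews, else 1."""
    return 2 if review_count > REVIEW_THRESHOLD else 1


def relative_rating(overall_rating: float, review_count: int) -> float:
    """Return SRR = R_p * R_c for a single application."""
    return overall_rating * reviewer_credit(review_count)


if __name__ == "__main__":
    # Hypothetical applications: (name, overall rating R_p, review count)
    apps = [
        ("example_app_a", 4.5, 2_500_000),
        ("example_app_b", 4.0, 300_000),
    ]
    for name, rating, count in apps:
        print(name, relative_rating(rating, count))
```

Under this reading, an application with exactly one million reviews gets $R_c = 1$, because the summary specifies "more than" one million; whether the paper treats the boundary the same way is an assumption.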