Abusive Language Detection in Online User Content

Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, Yi Chang
DOI: https://doi.org/10.1145/2872427.2883062
2016-04-11
Abstract: Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine-learning-based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.