Hostility Detection Dataset in Hindi

Mohit Bhardwaj, Md Shad Akhtar, Asif Ekbal, Amitava Das, Tanmoy Chakraborty
DOI: https://doi.org/10.48550/arXiv.2011.03588
2020-11-06
Computation and Language
Abstract: In this paper, we present a novel hostility detection dataset in the Hindi language. We collect and manually annotate ~8200 online posts. The annotated dataset covers four hostility dimensions: fake news, hate speech, offensive, and defamation posts, along with a non-hostile label. Because the hostile classes overlap significantly, hostile posts are also annotated with multi-label tags. We release this dataset as part of the CONSTRAINT-2021 shared task on hostile post detection.