Abstract: While social media platforms play an important role in our daily lives by providing the latest news and trends from across the globe, they are also prone to the widespread proliferation of harmful information in various forms, which leads to misconceptions among the masses. Accordingly, several prior works have attempted to tag social media posts with labels/classes reflecting their veracity, sentiment, hate content, etc. However, for such labelling to be convincing, it is important to additionally extract the post snippets on which the labelling decision is based. We call such a post snippet the 'rationale'. These rationales significantly improve human trust in, and the debuggability of, the predictions, especially when detecting misinformation or stigmas in social media posts. Rationale spans are also helpful in post-classification social analysis, such as identifying the target communities in hate speech, or understanding the arguments or concerns against vaccination. A post may also express multiple notions of misinformation, hate, sentiment, etc. Thus, determining one or more labels for a given piece of text, along with the text snippets explaining the rationale behind each identified label, is a challenging multi-label, multi-rationale classification task that remains nascent in the literature. While transformer-based encoder-decoder generative models such as BART and T5 are well suited to the task, in this work we show how a relatively simpler encoder-only discriminative question-answering (QA) model can be effectively trained using simple template-based questions to accomplish it. We thus propose MuLX-QA and demonstrate its utility in producing (label, rationale span) pairs in two different settings: multi-class (on the HateXplain dataset, related to hate speech on social media) and multi-label (on the CAVES dataset, related to COVID-19 anti-vaccine concerns). MuLX-QA outperforms heavier generative models in both settings. We also demonstrate the relative advantage of MuLX-QA over strong baselines when trained with limited data. We perform several ablation studies and experiments to better understand the effect of training MuLX-QA with different question prompts, and draw interesting inferences from them. Additionally, we show that MuLX-QA is also effective on social media posts in resource-poor non-English languages. Finally, we perform a qualitative analysis of our model's predictions and compare them with those of our strongest baseline.
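
To make the QA formulation concrete, the following is a minimal sketch of how template-based questions could yield (label, rationale span) pairs. It is an illustration under stated assumptions, not the MuLX-QA implementation: the question template, the label names, the helper functions, and the hand-written scores (standing in for the start/end outputs of an encoder-only QA model) are all hypothetical. The "no answer" handling follows the common SQuAD 2.0-style convention of comparing the best span against the score at position 0, which is assumed here rather than taken from the abstract.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical template; the wording actually used for MuLX-QA is not given in the abstract.
QUESTION_TEMPLATE = "Which part of the post indicates '{label}'?"


def build_questions(labels: Sequence[str]) -> Dict[str, str]:
    """Create one template-based question per candidate label."""
    return {label: QUESTION_TEMPLATE.format(label=label) for label in labels}


def decode_span(
    tokens: Sequence[str],
    start_scores: Sequence[float],
    end_scores: Sequence[float],
    max_len: int = 8,
) -> Optional[List[str]]:
    """Return the best-scoring span, or None if the null answer (index 0) wins."""
    best_score = start_scores[0] + end_scores[0]  # score of "no rationale"
    best_span = None
    for i in range(1, len(tokens)):
        for j in range(i, min(i + max_len, len(tokens))):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    if best_span is None:
        return None
    i, j = best_span
    return list(tokens[i : j + 1])


def extract_pairs(
    tokens: Sequence[str],
    scores_by_label: Dict[str, Tuple[Sequence[float], Sequence[float]]],
) -> List[Tuple[str, str]]:
    """Keep a label only if its question has a non-null answer span."""
    pairs = []
    for label, (start, end) in scores_by_label.items():
        span = decode_span(tokens, start, end)
        if span is not None:
            pairs.append((label, " ".join(span)))
    return pairs


# Toy post; index 0 plays the role of the [CLS] / null-answer position.
tokens = ["[CLS]", "rushed", "vaccine", "that", "does", "not", "work"]

# Hand-written stand-ins for model outputs, one (start, end) pair per label question.
scores = {
    "rushed": ([0.0, 2.0, 0.1, 0.0, 0.0, 0.0, 0.0], [0.0, 0.5, 1.5, 0.0, 0.0, 0.0, 0.2]),
    "ineffective": ([0.0, 0.0, 0.0, 0.1, 1.8, 0.3, 0.0], [0.0, 0.0, 0.0, 0.0, 0.2, 0.4, 2.0]),
    "mandatory": ([2.5, 0.1, 0.0, 0.0, 0.0, 0.0, 0.1], [2.5, 0.0, 0.1, 0.0, 0.0, 0.0, 0.1]),
}

questions = build_questions(list(scores))
pairs = extract_pairs(tokens, scores)
print(pairs)
```

In this toy run, two of the three label questions return a span and the third falls back to the null answer, so a single post receives multiple labels, each with its own rationale. In the multi-class setting, one would instead keep only the single highest-scoring label.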

MuLX-QA: Classifying Multi-Labels and Extracting Rationale Spans in Social Media Posts
