Teaching Text Classification Models Some Common Sense Via Q&A Statistics: A Light and Transplantable Approach

Hanqing Tao, Guanqi Zhu, Tong Xu, Qi Liu, Enhong Chen
DOI: https://doi.org/10.1007/978-3-031-17120-8_46
2022-01-01
Abstract: Sociolinguistics holds that common sense is a key factor in keeping language understanding and text classification robust in complex social environments. However, commonsense facts are often vague and implicit, and they are hard to formalize or to combine with current text classification techniques. Q&A communities offer an opportunity here: they are social platforms where people exchange knowledge, and implicit common sense is more readily reflected in the distribution of keywords contained in Q&A data. To this end, our key insight in this paper is to obtain implicit common sense by exploiting statistical keyword distributions from annotated Q&A corpora. On this basis we present a Statistics-based Label Interactive Model (SLIM) as an enhancement framework for current deep learning text classification models. Specifically, we first draw inspiration from the sociality shared by Q&A communities and commonsense knowledge, and propose a light pre-classification strategy. Then, to exploit the valuable semantics of Q&A domain labels, we devise a commonsense attention module to enhance the textual representation. Afterwards, we design a label interactive attention module to make our methods transplantable to any other deep learning model and text classification dataset. Finally, extensive experiments with several deep learning text classification models on three datasets verify the rationality and effectiveness of our methods.
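The idea of pre-classifying a text from keyword statistics can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give SLIM's formulas, so the toy corpus, the label names, the whitespace tokenization, and the averaging rule below are all assumptions chosen only to show how a keyword-to-label distribution gathered from annotated Q&A data might yield a cheap prior over Q&A domain labels.

```python
from collections import Counter, defaultdict

# Hypothetical toy Q&A corpus: (question text, domain label) pairs.
# A real annotated Q&A corpus would be far larger; these rows are illustrative only.
QA_CORPUS = [
    ("how do i treat a cold and fever", "health"),
    ("best medicine for a sore throat", "health"),
    ("which team won the football match", "sports"),
    ("how to train for a marathon", "sports"),
    ("how to fix a python import error", "computers"),
    ("why does my laptop overheat", "computers"),
]


def build_keyword_stats(corpus):
    """Count, for each word, how many questions of each domain label contain it."""
    stats = defaultdict(Counter)
    for text, label in corpus:
        for word in set(text.split()):
            stats[word][label] += 1
    return stats


def pre_classify(text, stats, labels):
    """Average the per-keyword label distributions into one distribution over labels.

    Assumed scoring rule (not taken from the paper): each known word votes with
    its normalized label counts, and the votes are averaged.
    """
    known = [w for w in text.split() if w in stats]
    if not known:
        # No keyword evidence: fall back to a uniform distribution.
        return {label: 1.0 / len(labels) for label in labels}
    scores = dict.fromkeys(labels, 0.0)
    for word in known:
        total = sum(stats[word].values())
        for label, count in stats[word].items():
            scores[label] += count / total
    return {label: s / len(known) for label, s in scores.items()}


labels = sorted({label for _, label in QA_CORPUS})
stats = build_keyword_stats(QA_CORPUS)
dist = pre_classify("my laptop shows a python error", stats, labels)
print(dist)
```

In a framework of the kind the abstract describes, a distribution like `dist` would feed a downstream attention module instead of being used as the final prediction; how SLIM actually does this is not specified in the abstract.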