Mitigating Spurious Correlations in Named Entity Recognition Models Through Counterfactual Data Augmentation

Jiawei Liu, Min Huang, Qinghai Miao
DOI: https://doi.org/10.1109/ijcnn60899.2024.10649919
2024-01-01
Abstract: Named Entity Recognition (NER) models perform well on standard benchmarks, but they often lack robustness when dealing with out-of-domain data. Recent studies have highlighted their limitations in genuine sentence comprehension, often due to their reliance on memorized entities or context, leading to spurious correlations. In this paper, we present Structural Causal Model-based techniques for detecting and addressing two types of spurious correlations in NER models: contextual and entity spurious correlations. Additionally, we introduce four data augmentation methods to improve training data and enable NER models to learn contextual and entity information effectively. Experimental results reveal that several NER models suffer from spurious correlations, and our approach mitigates them effectively. Furthermore, in comparison to the baseline methods, our approach demonstrates competitive performance on the in-domain dataset while achieving state-of-the-art results on the out-of-domain dataset.
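The abstract does not spell out the four augmentation methods, so the following is only an illustrative sketch of one generic form of counterfactual augmentation for NER: replacing each entity mention with another mention of the same type while keeping the context fixed, so that a model cannot rely on memorized entity strings. The function name `swap_entities`, the `lexicon` structure, and the BIO tagging scheme are assumptions made for this example and are not taken from the paper.

```python
import random

def swap_entities(tokens, tags, lexicon, rng):
    """Hypothetical helper: replace each BIO-tagged entity span with a
    same-type mention drawn from `lexicon` (entity type -> list of mentions)."""
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            etype = tag[2:]
            # Find the end of the current entity span.
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + etype:
                j += 1
            candidates = lexicon.get(etype)
            # Keep the original mention if no substitute is available.
            replacement = rng.choice(candidates).split() if candidates else tokens[i:j]
            new_tokens.extend(replacement)
            new_tags.extend(["B-" + etype] + ["I-" + etype] * (len(replacement) - 1))
            i = j
        else:
            # Context tokens (and any stray tags) are copied unchanged.
            new_tokens.append(tokens[i])
            new_tags.append(tag)
            i += 1
    return new_tokens, new_tags

tokens = ["Alice", "Smith", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
lexicon = {"PER": ["Chen Wei"], "LOC": ["New Delhi"]}  # assumed toy lexicon
aug_tokens, aug_tags = swap_entities(tokens, tags, lexicon, random.Random(0))
print(list(zip(aug_tokens, aug_tags)))
```

A complementary variant would hold the entities fixed and vary the surrounding context instead, which targets the contextual rather than the entity correlation; that variant is likewise an assumption here, not a description of the authors' procedure.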