Transformer with Modified Self-Attention for Flat-Lattice Chinese NER

Xiaojun Bi, Congcong Zhao, Weizheng Qiao
DOI: https://doi.org/10.1117/12.3005002
2023-01-01
Abstract: Chinese named entity recognition (NER) is a fundamental task in NLP and has attracted considerable research attention. Recently, the lattice structure supported by an auxiliary lexicon has been replaced by a new flat-lattice structure that fits the input of the Transformer encoder. Positional information in the flat-lattice is treated as an additional information flow into the self-attention calculation, which therefore requires a variant of self-attention. Although the distance feature is derived from the positions of two tokens (characters or words) in the flat-lattice, the current method considers only the interaction between the distance feature and the former token, rather than taking both tokens into account. In this paper, we propose a new variant of self-attention that considers the influence of the distance feature on both the former token and the latter token. Experiments on two open datasets show that our proposed model outperforms other state-of-the-art models.
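The abstract gives no formula, so the following is only an illustrative sketch of the idea, not the authors' implementation. It assumes a single attention head, assumes that the "former token" acts as the query and the "latter token" as the key, and assumes that each interaction is a dot product with a projected distance feature. All names (lattice_attention_scores, Wq, Wk, Wr, both_tokens) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lattice_attention_scores(X, R, Wq, Wk, Wr, both_tokens=True):
    """Attention scores with a distance feature (assumed formulation).

    X: (n, d) token representations (characters and words in the flat-lattice).
    R: (n, n, d) distance features, R[i, j] built from the positions of tokens i and j.
    Wq, Wk, Wr: (d, d_k) projection matrices (hypothetical parameters).
    """
    Q = X @ Wq  # (n, d_k) former-token (query) projections
    K = X @ Wk  # (n, d_k) latter-token (key) projections
    P = R @ Wr  # (n, n, d_k) projected distance features

    content = Q @ K.T                        # token-token interaction
    former = np.einsum("id,ijd->ij", Q, P)   # distance feature with the former token
    scores = content + former
    if both_tokens:
        # Assumed extra term: distance feature with the latter token as well.
        latter = np.einsum("jd,ijd->ij", K, P)
        scores = scores + latter
    return scores / np.sqrt(Q.shape[-1])
```

With both_tokens=False the sketch reduces to a score in which the distance feature interacts only with the former token; setting it to True adds the symmetric term for the latter token. Attention weights would then follow from softmax(scores) applied row-wise.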