Empowering Biomedical Named Entity Recognition Through Multi-Tagger Collaboration
Jin Zhao, Jian Xie, Tinghui Zhu, Qian Guo, Zhixu Li, Yanghua Xiao
DOI: https://doi.org/10.3233/faia240611
2024-01-01
Abstract: Biomedical Named Entity Recognition (BioNER) plays a crucial role in automatically identifying specific categories of entities in biomedical texts. Region-based methods currently show promising performance in BioNER. However, the existing paradigms behind these methods suffer from inherent limitations, including the generation of negative samples and the neglect of token dependencies. To overcome these limitations, we propose a new paradigm, implemented as the Token Cascade Tagger (TCT), which combines span identification and category classification. The TCT uses category information to strengthen the correlation between the heads and tails of entities, effectively reducing the generation of negative samples. Additionally, we introduce a Token Dependency Tagger (TDT) that captures token dependencies within entity spans by identifying the longest span in a sentence. The TDT filters out incorrect spans and further improves the accuracy of the spans detected by the TCT. Furthermore, we employ a multi-task learning framework to jointly optimize the TCT and the TDT, leading to superior performance in BioNER. Extensive experiments on publicly available biomedical datasets demonstrate that our method outperforms previous state-of-the-art methods, achieving F1 scores of 92.44%, 92.54%, and 81.26% on NCBI-Disease, BC5CDR, and GENIA, respectively.
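The abstract does not specify how tagger outputs are decoded, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a cascade-style tagger emits a category label (or nothing) for each token as an entity head and as an entity tail, that a head is paired with the nearest following tail of the same category, and that a dependency-style tagger yields a per-token mask marking tokens that belong to some entity region. The function names `decode_spans` and `filter_by_dependency`, the mask representation, and the example sentence are all hypothetical.

```python
from typing import List, Optional, Tuple

# A span is (start index, end index inclusive, category).
Span = Tuple[int, int, str]


def decode_spans(
    head_tags: List[Optional[str]],
    tail_tags: List[Optional[str]],
) -> List[Span]:
    """Pair each head with the nearest following tail of the same category.

    Assumption: matching on category is what links heads to tails, so
    head/tail pairs with different categories are never proposed.
    """
    spans: List[Span] = []
    for start, category in enumerate(head_tags):
        if category is None:
            continue
        for end in range(start, len(tail_tags)):
            if tail_tags[end] == category:
                spans.append((start, end, category))
                break
    return spans


def filter_by_dependency(spans: List[Span], inside_mask: List[bool]) -> List[Span]:
    """Keep only spans whose tokens are all marked as inside an entity region.

    Assumption: the mask stands in for the output of a dependency-style
    tagger; the real model may represent this differently.
    """
    return [
        (start, end, category)
        for start, end, category in spans
        if all(inside_mask[start : end + 1])
    ]


# Hypothetical sentence with one spurious head prediction at index 1.
tokens = ["BRCA1", "mutations", "cause", "breast", "cancer", "."]
head_tags = ["Gene", "Disease", None, "Disease", None, None]
tail_tags = ["Gene", None, None, None, "Disease", None]
inside_mask = [True, False, False, True, True, False]

candidates = decode_spans(head_tags, tail_tags)
entities = filter_by_dependency(candidates, inside_mask)
for start, end, category in entities:
    print(category, " ".join(tokens[start : end + 1]))
```

In this toy input the spurious head at "mutations" yields the candidate span "mutations cause breast cancer", which the mask-based filter discards because it crosses tokens outside any entity region, leaving the "Gene" span "BRCA1" and the "Disease" span "breast cancer".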