Named Entity Recognition for Long COVID Biomedical Literature by Using Bert-BiLSTM-IDCNN-ATT-CRF Approach

Zongwang Han, Shaofu Lin, Zhisheng Huang, Chaohui Guo
DOI: https://doi.org/10.1145/3644116.3644319
2023-01-01
Abstract: In recent years, as research into the pathological mechanisms and treatments of Long COVID has expanded, the number of related scientific publications has increased dramatically. Effectively extracting key information from these texts is of great importance for public health and research progress. In the Long COVID context, Named Entity Recognition (NER) can be used to identify disease names and symptoms, which helps to analyze the sequelae caused by COVID-19 and its relationship with other diseases. Unlike molecular biomedical text mining, which focuses on identifying entities such as genes, proteins, and chemicals and the relationships among them, Long COVID text mining faces problems such as the lack of publicly available labeled datasets and the heavy workload of manual annotation. Moreover, because Long COVID named entities are strongly domain-specific, models and methods that perform well in the general domain suffer significantly degraded NER performance in this domain. To address these problems, we constructed a Long COVID literature abstract NER dataset (LNER) and proposed a Long COVID biomedical literature NER model, Bert-BiLSTM-IDCNN-ATT-CRF (BBIAC). First, a BERT-BiLSTM-CRF model is constructed on the LNER dataset. Then, an iterated dilated convolutional neural network (IDCNN) is added between the BiLSTM and CRF layers to capture local features in the text sequences. Finally, feature enhancement is performed by using an attention mechanism to fuse the global and local features. The experimental results show that the proposed method can accurately extract Long COVID symptom and disease information from the literature and outperforms the baseline models.
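The two ideas the abstract adds on top of BERT-BiLSTM-CRF, dilated convolution for local features and attention-based fusion with the global features, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the kernel weights, the dilation widths, and the gating rule in `attention_fuse` (a softmax over the two feature values themselves) are all assumptions chosen only to keep the example small, and features are scalars per token rather than vectors.

```python
import math


def dilated_conv1d(seq, kernel, dilation):
    """Same-length 1D dilated convolution with zero padding (odd kernel width)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Neighbouring taps are `dilation` positions apart.
            idx = t + (k - half) * dilation
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_width, dilations):
    """Input positions visible to one output position after stacking the layers."""
    return 1 + sum((kernel_width - 1) * d for d in dilations)


def attention_fuse(global_feat, local_feat):
    """Per-token convex combination of a global and a local feature.

    Hypothetical scoring: the attention weight is a softmax over the two
    feature values themselves; a trained model would learn this scoring.
    """
    fused = []
    for g, l in zip(global_feat, local_feat):
        eg, el = math.exp(g), math.exp(l)
        a = eg / (eg + el)  # weight on the global feature
        fused.append(a * g + (1.0 - a) * l)
    return fused


# Stand-in for per-token BiLSTM outputs (global context), assumed values.
global_feat = [0.1, 0.4, -0.2, 0.3, 0.0]
# Local features from one dilated layer over the same sequence.
local_feat = dilated_conv1d(global_feat, kernel=[0.5, 1.0, 0.5], dilation=2)
fused = attention_fuse(global_feat, local_feat)
```

Stacking layers with growing dilation widens the context each token sees without adding parameters per layer: with a width-3 kernel and assumed dilations 1, 1, 2, `receptive_field(3, [1, 1, 2])` covers 9 tokens. In the full model the fused per-token features would feed the CRF layer that scores the tag sequence.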