Abstract: To address the poor entity recognition performance caused by the scarcity of annotated Chinese clinical electronic medical records, this paper proposes MF-MNER, a multi-medical-entity recognition method that fuses BART, Bi-LSTM, and CRF. First, after the electronic medical records are cleaned, encoded, and segmented, the resulting semantic representations are dynamically fused using a Bidirectional and Auto-Regressive Transformers (BART) model. Then, sequential information is captured using a bidirectional long short-term memory (Bi-LSTM) network. Finally, a conditional random field (CRF) layer decodes the sequence and outputs the multi-task entity recognition results. On the CCKS2019 dataset, the micro-avg Precision, macro-avg Recall, and weighted-avg Precision reach 0.880, 0.887, and 0.883, and the micro-avg, macro-avg, and weighted-avg F1-scores reach 0.875, 0.876, and 0.876, respectively. Under the same dataset conditions, our method outperforms the existing literature on all three averaging schemes (micro average, macro average, weighted average). Under weighted averaging, the Precision, Recall, and F1-score are 19.64%, 15.67%, and 17.58% higher, respectively, than those of the existing BERT-BiLSTM-CRF model. On an actual clinical dataset, MF-MNER achieves a Precision, Recall, and F1-score of 0.638, 0.825, and 0.719 under micro averaging; 0.685, 0.800, and 0.733 under macro averaging; and 0.647, 0.825, and 0.722 under weighted averaging.
These results show that MF-MNER integrates the advantages of the BART, Bi-LSTM, and CRF layers, significantly improving the performance of downstream named entity recognition with a small amount of annotation and achieving notably high recall, which gives it practical significance. Source code and datasets to reproduce the results in this paper are available at https://github.com/xfwang1969/MF-MNER.
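
The three averaging schemes reported in the abstract differ only in how per-class scores are combined. The following is a minimal sketch of the standard definitions, not the authors' evaluation script; the entity class names and the counts in `example` are hypothetical values chosen for illustration and do not come from the paper.

```python
from typing import Dict, Tuple

# Hypothetical per-class counts: (true positives, false positives, false negatives).
# These numbers are made up for illustration and are NOT from the paper.
Counts = Dict[str, Tuple[int, int, int]]


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def averaged_scores(counts: Counts) -> Dict[str, Tuple[float, ...]]:
    per_class = {c: prf(*v) for c, v in counts.items()}
    # Support = number of gold entities of each class.
    support = {c: tp + fn for c, (tp, _, fn) in counts.items()}
    total = sum(support.values())

    # Micro: pool the counts over all classes, then compute the scores once.
    micro = prf(
        sum(tp for tp, _, _ in counts.values()),
        sum(fp for _, fp, _ in counts.values()),
        sum(fn for _, _, fn in counts.values()),
    )
    # Macro: unweighted mean of the per-class scores.
    n = len(per_class)
    macro = tuple(sum(s[i] for s in per_class.values()) / n for i in range(3))
    # Weighted: mean of the per-class scores, weighted by class support.
    weighted = tuple(
        sum(per_class[c][i] * support[c] for c in per_class) / total
        for i in range(3)
    )
    return {"micro": micro, "macro": macro, "weighted": weighted}


example = {"disease": (80, 10, 20), "drug": (15, 5, 5), "surgery": (5, 5, 0)}
scores = averaged_scores(example)
for name, (p, r, f) in scores.items():
    print(f"{name:>8}: P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Under these definitions, weighted-average recall always equals micro-average recall, because weighting each class recall by its support recovers the pooled true-positive rate. This is consistent with the identical recall of 0.825 reported for both schemes on the clinical dataset.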

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the poor performance of entity recognition in Chinese clinical electronic medical records due to the lack of labeled data. Specifically, the paper proposes a multi-model fusion method (MF-MNER), which combines BART, Bi-LSTM and CRF techniques to improve the ability to accurately recognize medical entities in Chinese clinical electronic medical records with a small amount of labeled data.

### Main contributions:

1. **Multi-model fusion**: By fusing the advantages of BART, Bi-LSTM and CRF models, the performance of entity recognition is improved.
2. **Dynamically fusing semantic representations**: The BART model is used to dynamically fuse the semantic representations of input texts, enhancing the model's context-understanding ability.
3. **Capturing sequence information**: The Bi-LSTM network is utilized to capture sequence information, further improving the model's context-modeling ability.
4. **Multi-task entity recognition**: The CRF layer is used to decode and output multi-task entity recognition results, improving the overall recognition accuracy and recall rate.

### Experimental results:

- On the CCKS2019 dataset, the micro-avg precision, macro-avg recall, and weighted-avg precision reached 0.880, 0.887 and 0.883 respectively.
- The micro-avg F1-score, macro-avg F1-score, and weighted-avg F1-score were 0.875, 0.876 and 0.876 respectively.
- Compared with the existing BERT-BiLSTM-CRF model, under the weighted-avg condition, the precision, recall rate and F1-score were increased by 19.64%, 15.67% and 17.58% respectively.

### Practical application tests:

- On the actual clinical dataset under the micro-avg condition, the precision, recall rate and F1-score were 0.638, 0.825 and 0.719 respectively.
- Under the macro-avg condition, the precision, recall rate and F1-score were 0.685, 0.800 and 0.733 respectively.
- Under the weighted-avg condition, the precision, recall rate and F1-score were 0.647, 0.825 and 0.722 respectively.

### Conclusion:

The MF-MNER method proposed in the paper effectively integrates the advantages of the BART, Bi-LSTM and CRF layers and significantly improves the performance of entity recognition in Chinese clinical electronic medical records with a small amount of labeled data. It performs especially well in terms of recall rate and has practical value.
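
The final stage of the pipeline described above, CRF decoding, is commonly implemented with the Viterbi algorithm. The sketch below is a generic pure-Python version for a linear-chain model, not code from the MF-MNER repository. The tag set (`O`, `B-DIS`, `I-DIS`), the emission scores, and the transition scores are all hypothetical; in the described architecture the emission scores would come from the Bi-LSTM and the transition scores would be learned by the CRF layer.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag path for a linear-chain model.

    emissions[t][tag] scores `tag` at position t;
    transitions[a][b] scores moving from tag a to tag b.
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical tag set and scores for a three-character input.
tags = ["O", "B-DIS", "I-DIS"]
transitions = {a: {b: 0.0 for b in tags} for a in tags}
transitions["O"]["I-DIS"] = -100.0  # an entity cannot continue right after "O"
emissions = [
    {"O": 1.0, "B-DIS": 0.8, "I-DIS": 0.1},
    {"O": 0.2, "B-DIS": 0.3, "I-DIS": 1.5},
    {"O": 1.0, "B-DIS": 0.1, "I-DIS": 0.2},
]
decoded = viterbi_decode(emissions, transitions, tags)
print(decoded)  # ['B-DIS', 'I-DIS', 'O']
```

Picking the best tag independently at each position would give `O, I-DIS, O`, which is not a valid BIO sequence. Decoding over the whole sequence with transition scores avoids this, which is the usual motivation for placing a CRF layer on top of a Bi-LSTM.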

MF-MNER: Multi-models Fusion for MNER in Chinese Clinical Electronic Medical Records
