Improve on Entity Recognition Method Based on BiLSTM-CRF Model for the Nuclear Technology Knowledge Graph

Yitang Liu, Zhenchuan Wang, Wei Zhao, Yunliang Lan, Yang Liu, Ruixin Shi
DOI: https://doi.org/10.1109/PRAI55851.2022.9904215
2022-08-19
Abstract: The accuracy of entity recognition is particularly important for knowledge graph construction. Traditional named entity recognition (NER) models mainly include HMM, CRF, BiLSTM, and BiLSTM-CRF. When these four models are applied to labeled nuclear technology knowledge data sets, they struggle with the word-meaning confusion caused by incorrect segmentation at the end of an entity. To improve entity recognition and address polysemy in the nuclear technology knowledge data set, an improved nuclear technology entity recognition method based on a combined BERT-BiLSTM-CRF model is proposed and evaluated through comparative experiments. The results indicate that using the BERT model instead of the Word2Vec algorithm for word vector training helps model recognition, and that word segmentation with a domain-specific dictionary and part-of-speech classification of nuclear technology texts improve the quality of the labeled data. The experiments verified that using the BERT pre-trained model can solve the problem of polysemy in NER to some extent. They also validated the precision of nuclear technology knowledge entity recognition on the collected data, which could serve as a preprocessing step in nuclear technology knowledge graph construction.
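As a rough illustration of why a CRF layer is placed on top of per-token scores in BiLSTM-CRF style models, the following minimal sketch runs Viterbi decoding over a toy example. It is not the authors' implementation: the tag set, the emission scores (standing in for hypothetical BiLSTM outputs), and the transition scores are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at position t (assumed to come from an encoder).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])          # best score of any path ending in each tag
    backptrs = []                       # backptrs[t-1][j]: best previous tag for tag j at t
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)
    # Follow back-pointers from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backptrs):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path


# Hypothetical BIO tag set and scores, for illustration only.
TAGS = ["O", "B-ENT", "I-ENT"]
FORBIDDEN = -1e4
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O: an entity cannot start with I-ENT
    [0.0, 0.0, 0.0],        # from B-ENT
    [0.0, 0.0, 0.0],        # from I-ENT
]
emissions = [
    [2.0, 0.5, 0.1],  # token 0: O scores highest
    [0.2, 1.0, 1.5],  # token 1: I-ENT scores highest on its own
    [0.1, 0.3, 2.0],  # token 2: I-ENT scores highest
]

# Per-token argmax ignores transitions and yields an invalid O -> I-ENT step.
greedy = [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in emissions]
# Viterbi decoding respects the transition scores.
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(greedy)
print(decoded)
```

In this toy case, choosing the best tag for each token independently produces an ill-formed entity boundary, while decoding with transition scores recovers a well-formed tag sequence. In a trained model the transition scores would be learned rather than set by hand.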
Engineering, Computer Science