A Survey of CRF Algorithm Based Knowledge Extraction of Elementary Mathematics in Chinese

Shuai Liu, Tenghui He, Jianhua Dai
DOI: https://doi.org/10.1007/s11036-020-01725-x
2021-01-03
Mobile Networks and Applications
Abstract: Chinese word segmentation is an important research direction in elementary mathematics knowledge extraction. Segmentation speed directly affects downstream applications, and segmentation accuracy directly affects the research that builds on it. Among machine learning methods for extracting basic mathematical knowledge points, the Conditional Random Field (CRF) model performs well at new word discovery and is increasingly used in knowledge extraction for basic mathematics. This article first introduces the traditional CRF process for named entity recognition. Then, an improved algorithm, CRF++, for the conditional random field model is proposed. Since the recognition rate of named entities based on traditional machine learning methods is not high, a post-processing method for entity recognition that automatically generates a dictionary is proposed. After mathematical entities are identified, a pruning strategy combining the Viterbi algorithm with rules is proposed to achieve a higher recognition rate for elementary mathematical entities. Finally, several methods of disambiguation after entity recognition are introduced.
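To make the Viterbi step mentioned in the abstract concrete, here is a minimal Python sketch of constrained Viterbi decoding for character-level segmentation. It is a generic textbook formulation, not the paper's pruning strategy, whose details the abstract does not give. The BMES tag scheme, the transition constraints in `ALLOWED`, the function names `viterbi` and `segment`, and all scores in `EMISSIONS` are assumptions made up for illustration; a real CRF would supply learned scores.

```python
import math

# Assumed tag scheme: B = begin, M = middle, E = end, S = single-character word.
TAGS = ["B", "M", "E", "S"]

# Assumed rule-style constraints: which tags may follow each tag.
ALLOWED = {
    "B": {"M", "E"},
    "M": {"M", "E"},
    "E": {"B", "S"},
    "S": {"B", "S"},
}

NEG = -math.inf


def viterbi(emissions):
    """Return the best valid tag path.

    emissions: one dict per character mapping tag -> log-space score.
    Missing tags are treated as impossible (-inf).
    """
    if not emissions:
        return []
    # A sentence may only start with B or S.
    best = [{t: emissions[0].get(t, NEG) if t in ("B", "S") else NEG
             for t in TAGS}]
    back = []
    for i in range(1, len(emissions)):
        scores, ptrs = {}, {}
        for cur in TAGS:
            # Only predecessors that the rules allow are considered (pruning).
            prev_score, prev_tag = max(
                (best[-1][p], p) for p in TAGS if cur in ALLOWED[p]
            )
            scores[cur] = prev_score + emissions[i].get(cur, NEG)
            ptrs[cur] = prev_tag
        best.append(scores)
        back.append(ptrs)
    # A sentence may only end with E or S.
    last = max(("E", "S"), key=lambda t: best[-1][t])
    path = [last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return path[::-1]


def segment(text, tags):
    """Cut text into words at every E or S tag."""
    words, cur = [], ""
    for ch, tag in zip(text, tags):
        cur += ch
        if tag in ("E", "S"):
            words.append(cur)
            cur = ""
    if cur:
        words.append(cur)
    return words


# Toy example: "the area of a triangle". Scores are invented, not learned.
TEXT = "三角形的面积"
EMISSIONS = [
    {"B": -0.2, "S": -1.8},
    {"E": -0.5, "M": -0.6, "B": -3.0, "S": -3.0},
    {"E": -0.3, "M": -2.0, "S": -1.5, "B": -2.0},
    {"S": -0.1, "B": -2.0},
    {"B": -0.2, "S": -1.5},
    {"E": -0.2, "S": -1.6},
]

if __name__ == "__main__":
    tags = viterbi(EMISSIONS)
    print(tags, segment(TEXT, tags))
```

In this toy input, the second character scores slightly higher as E than as M. Picking the best tag per character would therefore give the invalid sequence B, E, E, whereas the constrained search returns a valid path.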
computer science, information systems, telecommunications, hardware & architecture