Rich Features Based Conditional Random Fields for Biological Named Entities Recognition

Chengjie Sun, Yi Guan, Xiaolong Wang, Lei Lin
DOI: https://doi.org/10.1016/j.compbiomed.2006.12.002
IF: 7.7
2007-01-01
Computers in Biology and Medicine
Abstract: Biological named entity recognition is a critical task for automatically mining knowledge from biological literature. In this paper, the task is cast as a sequential labeling problem, and a Conditional Random Fields model is introduced to solve it. Within the Conditional Random Fields framework, rich features are used, including literal, context, and semantic features. Among these, shallow syntactic features are introduced for the first time and effectively improve the model's performance. Experiments show that our method achieves an F-measure of 71.2% on an open evaluation data set, which is better than most state-of-the-art systems.
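To make the sequential labeling setup concrete, the following is a minimal sketch of per-token feature extraction of the kind a CRF tagger could consume. It is not the authors' feature set: the function names (`word_shape`, `token_features`, `sentence_features`), the feature keys, the window size, and the example sentence with its POS, chunk, and BIO labels are all assumptions made for illustration. Semantic features are omitted because the abstract does not describe them.

```python
import re


def word_shape(token):
    """Map characters to classes: A (upper), a (lower), 0 (digit), - (other)."""
    shape = re.sub(r"[A-Z]", "A", token)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "0", shape)
    return re.sub(r"[^Aa0]", "-", shape)


def token_features(tokens, pos_tags, chunk_tags, i, window=1):
    """Build a feature dict for token i (hypothetical feature names)."""
    tok = tokens[i]
    feats = {
        # Literal features: surface form of the token itself.
        "word.lower": tok.lower(),
        "shape": word_shape(tok),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "has_digit": any(c.isdigit() for c in tok),
        # Shallow syntactic features: POS and chunk tags, assumed to be
        # supplied by an external tagger or chunker.
        "pos": pos_tags[i],
        "chunk": chunk_tags[i],
    }
    # Context features: neighbouring words inside a small window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    return feats


def sentence_features(tokens, pos_tags, chunk_tags):
    """One feature dict per token, aligned with a BIO label sequence."""
    return [
        token_features(tokens, pos_tags, chunk_tags, i)
        for i in range(len(tokens))
    ]


# Illustrative input only; the tags and labels are made up for this sketch.
tokens = ["IL-2", "gene", "expression"]
pos_tags = ["NN", "NN", "NN"]
chunk_tags = ["B-NP", "I-NP", "I-NP"]
labels = ["B-DNA", "I-DNA", "O"]

features = sentence_features(tokens, pos_tags, chunk_tags)
```

Each sentence then becomes a pair of equal-length sequences (`features`, `labels`), which is the input shape a linear-chain CRF trainer typically expects.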