Improved Viterbi Algorithm-Based HMM2 for Chinese Words Segmentation

Lei La, Qiao Guo, Dequan Yang, Qimin Cao
DOI: https://doi.org/10.1109/ICCSEE.2012.249
2012-01-01
Abstract: To address problems caused by the individualism of Chinese architecture, more and more researchers are focusing on hybrid and improved Hidden Markov Models. However, although word segmentation is the foundation of Chinese natural language processing, studies of Chinese word segmentation based on the second-order Hidden Markov Model (HMM2) remain scarce. This article proposes a word-frequency-weighted smoothing method and a Threshold-Viterbi algorithm and combines them to build an Improved Viterbi Algorithm-based HMM2 (IV-HMM2) model that overcomes the data sparsity problem and improves accuracy. Experimental results demonstrate that the improved model achieves better performance and lower overhead than the traditional HMM2.
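For readers unfamiliar with second-order decoding, the following is a minimal sketch of Viterbi search over tag pairs, not the paper's IV-HMM2. It rests on several assumptions that the abstract does not confirm: characters are labelled with B/M/E/S tags, emissions depend only on the current tag, and a simple beam-style score cutoff (the hypothetical `beam` parameter) stands in for the Threshold-Viterbi algorithm. The word-frequency-weighted smoothing is not reproduced; missing table entries are simply treated as zero probability. All function and parameter names are illustrative.

```python
import math
from itertools import product

NEG_INF = float("-inf")


def prune(delta, beam):
    """Keep only tag pairs whose log score is within `beam` of the best one."""
    if not delta:
        return delta
    best = max(delta.values())
    return {pair: s for pair, s in delta.items() if s >= best - beam}


def viterbi_hmm2(obs, states, log_init, log_trans1, log_trans2, log_emit,
                 beam=math.inf):
    """Second-order Viterbi decoding in log space (illustrative sketch).

    log_init[s]               : log P(s) for the first tag
    log_trans1[(s0, s1)]      : log P(s1 | s0), used only for the second tag
    log_trans2[(s0, s1, s2)]  : log P(s2 | s0, s1)
    log_emit[(s, o)]          : log P(o | s)
    beam                      : assumed pruning width; inf disables pruning
    """
    if not obs:
        return []
    first = {s: log_init.get(s, NEG_INF) + log_emit.get((s, obs[0]), NEG_INF)
             for s in states}
    if len(obs) == 1:
        return [max(first, key=first.get)]

    # delta maps (previous tag, current tag) to the best log score so far.
    delta = {}
    for u, v in product(states, repeat=2):
        score = (first[u] + log_trans1.get((u, v), NEG_INF)
                 + log_emit.get((v, obs[1]), NEG_INF))
        if score > NEG_INF:
            delta[(u, v)] = score

    backptrs = []  # backptrs[k][(v, w)] is the tag two steps before w
    for o in obs[2:]:
        delta = prune(delta, beam)
        new_delta, bp = {}, {}
        for (u, v), prev in delta.items():
            for w in states:
                score = (prev + log_trans2.get((u, v, w), NEG_INF)
                         + log_emit.get((w, o), NEG_INF))
                if score > new_delta.get((v, w), NEG_INF):
                    new_delta[(v, w)] = score
                    bp[(v, w)] = u
        delta = new_delta
        backptrs.append(bp)

    if not delta:
        raise ValueError("no tag sequence has non-zero probability")

    # Follow back-pointers from the best final pair to recover the tags.
    v, w = max(delta, key=delta.get)
    path = [w, v]
    for bp in reversed(backptrs):
        u = bp[(v, w)]
        path.append(u)
        v, w = u, v
    return path[::-1]


def tags_to_words(chars, tags):
    """Group characters into words; a word ends at an E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # flush an unfinished word at the end of the input
        words.append(current)
    return words
```

With `beam=math.inf` the search is exact; a finite `beam` trades accuracy for fewer tag pairs expanded per character, which is one plausible way to lower decoding overhead.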