Human-Computer Interactive Chinese Word Segmentation: An Adaptive Dirichlet Process Mixture Model Approach

Tongfei Chen, Xiaojun Zou, Weimeng Zhu, Junfeng Hu
2013-01-01
Abstract: Previous research shows that Kalman filter based human-computer interactive Chinese word segmentation is encouragingly effective at reducing user interventions, but it cannot distinguish segmentation ambiguities. This paper proposes a novel approach to this problem using an adaptive Dirichlet process mixture model. By adjusting the hyperparameters of the model, ideal classifiers can be generated that conform to the interventions provided by users. Experiments show that our approach achieves a notable improvement in handling segmentation ambiguities. With knowledge learned from users, our model outperforms the baseline Kalman filter model by about 0.5% in segmenting homogeneous texts.