A Topic Detection Method Based on Semantic Dependency Distance and PLSA
Yan Chen, Yang, Huisan Zhang, Haiping Zhu, Feng Tian
DOI: https://doi.org/10.1109/cscwd.2012.6221895
2012-01-01
Abstract: Topic detection is an active research area in text mining. In this paper, focusing on Chinese interactive text, we explore a novel topic detection method, SDD-PLSA, which integrates Semantic Dependency Distance (SDD) with PLSA. It retains the advantages of PLSA, an efficient and effective method widely used in text mining, while also taking semantic and syntactic information into account, which addresses PLSA's lack of semantic information. SDD-PLSA has two main steps. First, SDD is used to group sentences with high semantic similarity, based on semantic features extracted from the interactive text. Then, a PLSA classifier is applied to the result of the first step. Experiments show that detection accuracy on the 'love' topic improves to 64.8% with SDD-PLSA, compared with 55.4% with PLSA alone.
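To make the second step concrete, the following is a minimal sketch of standard PLSA (Probabilistic Latent Semantic Analysis) fitted by EM on a count matrix. It is not the authors' implementation: the abstract does not specify how SDD is computed, so the grouping step is not shown, and the rows of the toy matrix are simply assumed to be sentence groups produced by it. The function name `plsa`, its parameters, and the toy counts are all hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard PLSA by EM on a (groups x words) count matrix.

    Returns P(z | d) with shape (groups, topics) and
    P(w | z) with shape (topics, words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation, normalised to valid probability distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # groups x topics x words
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from n(d, w) * P(z | d, w).
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Hypothetical input: each row stands for one sentence group (assumed to come
# from the SDD step), each column for one vocabulary term.
counts = np.array([
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
], dtype=float)

p_z_d, p_w_z = plsa(counts, n_topics=2)
topics = p_z_d.argmax(axis=1)  # most probable topic for each group
print(topics)
```

EM only finds a local optimum, so the topic assignments can depend on the random seed. In practice one might run several initialisations and keep the one with the highest likelihood.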