A Novel Composite Kernel for Finding Similar Questions in CQA Services

Jun Wang, Zhoujun Li, Xia Hu, Biyun Hu
DOI: https://doi.org/10.1007/978-3-642-14246-8_59
2010-01-01
Abstract: Finding similar questions in Community Question Answering (CQA) services plays an increasingly important role in current web and IR applications. The task aims to retrieve historical questions that are similar or relevant to new questions posed by users. However, traditional "bag-of-words" models often fail to measure the similarity between question sentences, because they usually ignore sequential and syntactic information. In this paper, we propose a novel composite kernel to improve the accuracy of question matching. Our study illustrates that the composite kernel can efficiently capture both lexical semantics and syntactic information in a question sentence by leveraging a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. Experimental results on real-world datasets show that our proposed method significantly outperforms the state-of-the-art models.
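The idea of combining several kernels can be illustrated with a minimal sketch. The abstract does not say how the component kernels are defined or combined, so everything here is an assumption: a simple contiguous n-gram (spectrum-style) kernel stands in for both the word sequence kernel and the POS tag sequence kernel, the kernels are cosine-normalized, and they are combined as a weighted sum. The syntactic tree kernel is left out for brevity, and the names `sequence_kernel`, `composite_kernel`, `w_word`, and `w_pos` are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, n_max=2):
    """Count all contiguous n-grams of length 1..n_max in a token list."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def sequence_kernel(a, b, n_max=2):
    """Assumed stand-in kernel: inner product of n-gram count vectors."""
    ca, cb = ngram_counts(a, n_max), ngram_counts(b, n_max)
    return sum(v * cb[k] for k, v in ca.items())


def normalized(kernel, a, b):
    """Cosine-normalize a kernel so that k(x, x) == 1 for non-empty x."""
    denom = sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom else 0.0


def composite_kernel(q1, q2, w_word=0.5, w_pos=0.5):
    """Assumed combination: weighted sum of normalized component kernels.

    A weighted sum of valid kernels with non-negative weights is itself
    a valid kernel; a tree kernel term could be added the same way.
    """
    k_word = normalized(sequence_kernel, q1["words"], q2["words"])
    k_pos = normalized(sequence_kernel, q1["pos"], q2["pos"])
    return w_word * k_word + w_pos * k_pos


# Hypothetical questions with hand-assigned POS tags.
q1 = {"words": ["how", "do", "i", "learn", "python"],
      "pos": ["WRB", "VBP", "PRP", "VB", "NN"]}
q2 = {"words": ["how", "can", "i", "learn", "java"],
      "pos": ["WRB", "MD", "PRP", "VB", "NN"]}

similarity = composite_kernel(q1, q2)
```

In this sketch the two questions share few words but nearly the same POS pattern, so the POS term raises the similarity above what word overlap alone would give, which is the kind of sequential information a bag-of-words model ignores.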