Semi-Supervised Learning of Dialogue Acts Using Sentence Similarity Based on Word Embeddings

Xiaohao Yang, Jia Liu, Zhenfeng Chen, Weilan Wu
DOI: https://doi.org/10.1109/icalip.2014.7009921
2014-01-01
Abstract: This paper describes a methodology for semi-supervised learning of dialogue acts based on the similarity between sentences. We assume that dialogue sentences sharing the same dialogue act are more similar to one another in their semantic and syntactic information. Previous work on sentence similarity mainly modeled a sentence as a bag of words and then compared groups of words using corpus-based or knowledge-based measures of word semantic similarity. In contrast, we present a vector-space sentence representation composed of word embeddings, that is, distributed representations of the words, organised according to the syntactic structure of the sentence. Given the vectors of the dialogue sentences, a distance measure can be defined to compute the similarity between them. Finally, a seeded k-means clustering algorithm is used to classify the dialogue sentences into several categories corresponding to particular dialogue acts. This constitutes the semi-supervised nature of the approach, which aims to reduce the reliance on the availability of annotated corpora. Experiments with the Switchboard Dialog Act corpus show that classification accuracy is improved by 14% compared to state-of-the-art methods based on Support Vector Machines.
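The pipeline outlined in the abstract (sentence vectors built from word embeddings, a distance measure, then seeded k-means) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify how the syntactic structure is used, so the sketch simply averages word embeddings as a stand-in, uses Euclidean distance as an assumed measure, and relies on a toy two-dimensional embedding table. The names `sentence_vector`, `seeded_kmeans`, and the labels `question` and `backchannel` are hypothetical.

```python
import math

def sentence_vector(tokens, embeddings):
    """Average the embeddings of known tokens.
    Assumption: plain averaging stands in for the paper's syntax-aware composition."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        raise ValueError("no known tokens in sentence")
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def seeded_kmeans(points, seeds, max_iter=100):
    """Cluster `points`, initialising one centroid per label from labelled seeds.

    points: list of vectors.
    seeds:  dict mapping a label to the indices of its labelled examples.
    Returns one label per point. Seeds only initialise the centroids;
    every point, seeds included, may be reassigned in later iterations.
    """
    labels = sorted(seeds)
    centroids = {lab: mean([points[i] for i in seeds[lab]]) for lab in labels}
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(labels, key=lambda lab: euclidean(p, centroids[lab]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: recompute each centroid from its current members.
        for lab in labels:
            members = [p for p, a in zip(points, assignment) if a == lab]
            if members:
                centroids[lab] = mean(members)
    return assignment

# Toy 2-D embeddings, invented purely for illustration.
embeddings = {
    "what": [1.0, 0.0], "where": [0.9, 0.1], "who": [0.8, 0.2],
    "yes": [0.0, 1.0], "okay": [0.1, 0.9], "right": [0.2, 0.8],
}
sentences = [["what"], ["yes"], ["where", "who"], ["okay", "right"]]
points = [sentence_vector(s, embeddings) for s in sentences]

# Only the first two sentences carry a label; the other two are unlabelled.
seeds = {"question": [0], "backchannel": [1]}
labels = seeded_kmeans(points, seeds)
print(labels)
```

In this sketch the labelled seeds fix both the number of clusters and their meaning, so each resulting cluster maps directly to a dialogue act, and only a handful of annotated sentences per act is needed.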