CNN-BLSTM Based Question Detection from Dialogs Considering Phase and Context Information.

Yuke Si, Longbiao Wang, Jianwu Dang, Mengfei Wu, Aijun Li
DOI: https://doi.org/10.21437/interspeech.2019-1701
2019-01-01
Abstract: Question detection from dialogs is important in human-computer interaction systems. Recent studies on question detection mostly use recurrent neural network (RNN) based methods to process low-level descriptors (LLD) of the utterance. However, these studies have three main problems. First, traditional LLD features are defined from human a priori knowledge, and some of them are difficult to extract accurately. Second, previous studies of question detection consider only features derived from amplitude information and ignore phase information. Third, previous studies show that the context within an utterance helps to detect questions, but the context between utterances has not been well investigated for this task. To address these problems, we propose a CNN-BLSTM based framework in which amplitude information is obtained from the combination of spectrogram and LLD and is processed together with the phase information. Our framework also models the context information in the dialog. Experiments on a Mandarin dialog corpus show the effectiveness of the integrated feature with both amplitude and phase in question detection. The results indicate that the phase feature helps to detect questions with a short duration, and that the context between utterances is beneficial for detecting questions without special interrogative forms.
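The abstract does not say how the phase information is extracted or how it is combined with the amplitude features. As a rough illustration only, the sketch below assumes a plain short-time Fourier transform, takes the log magnitude as the amplitude feature, and encodes the phase as its cosine and sine so that the values stay continuous across the wrap at plus or minus pi. The function names, frame sizes, and the concatenation step are all hypothetical choices made for this example, not the authors' method, and the LLD stream and the CNN-BLSTM model are omitted.

```python
import numpy as np

def stft_amplitude_phase(signal, frame_len=400, hop=160, n_fft=512):
    """Return (log_amplitude, cos_phase, sin_phase), each shaped (frames, n_fft // 2 + 1).

    frame_len, hop and n_fft are illustrative values (25 ms / 10 ms at 16 kHz),
    not settings taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short utterances so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    log_amp = np.log(np.abs(spectrum) + 1e-8)  # amplitude stream
    phase = np.angle(spectrum)                 # raw phase in [-pi, pi]
    return log_amp, np.cos(phase), np.sin(phase)

def integrated_feature(signal):
    """Concatenate amplitude and phase streams along the frequency axis (assumed fusion)."""
    log_amp, cos_p, sin_p = stft_amplitude_phase(signal)
    return np.concatenate([log_amp, cos_p, sin_p], axis=1)
```

Under these assumptions, each utterance becomes a (frames, features) matrix that a convolutional front end followed by a bidirectional LSTM could consume; modelling the context between utterances would additionally require feeding the representations of neighbouring utterances, which this sketch leaves out.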