A Gaussian Mixture Model for Dialogue Generation with Dynamic Parameter Sharing Strategy.

Qingqing Zhu, Pengfei Wu, Zhouxing Tan, Jiaxin Duan, Fengyu Lu, Junfei Liu
DOI: https://doi.org/10.1109/icassp43922.2022.9747775
2022-01-01
Abstract: Existing dialogue models are trained in an encoder-decoder framework that applies the same parameters to all of the data, ignoring the multinomial nature of the dataset's distribution. In practice, improving and developing a model commonly requires fine-grained modeling of individual data subsets. However, collecting a labeled fine-grained dialogue dataset often requires expert-level domain knowledge and is therefore difficult to scale in the real world. To better model multinomial data for dialogue generation, we study an approach that combines unsupervised clustering with a generative model in a GMM (Gaussian Mixture Model) based encoder-decoder framework. Specifically, our model samples from the prior and recognition distributions over the latent variables using a Gaussian mixture network, giving the latent layer the capability to form multiple clusters. We also introduce knowledge distillation to guide and improve the clustering results. Finally, we use a dynamic parameter sharing strategy conditioned on different labels to train different decoders. Experimental results on a widely used dialogue dataset verify the effectiveness of the proposed method.
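As a rough illustration of sampling a latent variable from a Gaussian mixture, the sketch below first draws a cluster index and then draws the latent vector from that cluster's Gaussian. It is not the paper's implementation: the function name `sample_gmm_latent`, the three-cluster setup, and all numeric values are assumptions chosen for the example, and in the paper the mixture parameters would come from a learned network rather than fixed lists.

```python
import random


def sample_gmm_latent(weights, means, stds, rng):
    """Draw (k, z): a component index k, then z from that component's diagonal Gaussian."""
    # Pick a mixture component according to the mixture weights.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Reparameterized draw per dimension: z = mu + sigma * eps, eps ~ N(0, 1).
    z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(means[k], stds[k])]
    return k, z


# Hypothetical 2-dimensional latent space with 3 clusters (illustrative values only).
weights = [0.5, 0.3, 0.2]
means = [[-2.0, 0.0], [0.0, 2.0], [2.0, 0.0]]
stds = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

rng = random.Random(0)
k, z = sample_gmm_latent(weights, means, stds, rng)
print("cluster:", k, "latent:", z)
```

In a setup like the one the abstract describes, the sampled cluster index `k` could serve as the label that selects which decoder parameters to use, although how the paper wires this together is not specified here.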