Vector-Quantization-Based Topic Modeling

Amulya Gupta, Zhu Zhang
DOI: https://doi.org/10.1145/3450946
IF: 5
2021-04-22
ACM Transactions on Intelligent Systems and Technology
Abstract: With the purpose of learning and utilizing explicit and dense topic embeddings, we propose three variations of novel vector-quantization-based topic models (VQ-TMs): (1) Hard VQ-TM, (2) Soft VQ-TM, and (3) Multi-View Soft VQ-TM. The model family capitalizes on vector quantization techniques, embedded input documents, and a view of words as mixtures of topics. Guided by a comprehensive set of evaluation metrics, we conduct systematic quantitative and qualitative empirical studies and demonstrate the superior performance of VQ-TMs compared to important baseline models. Through a unique case study on code generation from natural language descriptions, we further illustrate the power of VQ-TMs in downstream tasks.
computer science, information systems, artificial intelligence