CoTE: A Flexible Method for Joint Learning of Topic and Embedding Models

Bo Zhao, Chunfeng Yuan, Yihua Huang
DOI: https://doi.org/10.1007/978-981-97-2421-5_27
2024-01-01
Abstract: Topic models and embedding models are two of the most popular categories of techniques for learning latent semantics from text. In topic models, each word is generated according to its global context, whereas in embedding models, each word occurrence is characterized by its surrounding words. It is therefore desirable to train topic and embedding models jointly, using multi-context information to learn better representations. In this paper, we propose a flexible method named CoTE to achieve this goal, which can integrate a variety of topic and embedding models. We design a general 3-stage learning procedure, based on a rotation optimization scheme, to optimize the parameters of CoTE. We choose and combine two groups of de facto topic and embedding models to implement the CoTE-PD and CoTE-LW algorithms. Experimental results show that CoTE improves the accuracy of both individual components.