Research on the Extraction and Alignment of Tibetan-Chinese Cross-language Topics

Yuan SUN, Qian ZHAO
2017-01-01
Abstract: To discover synchronized, associated topics in Tibetan and Chinese social networks, we build an LDA topic model on a Tibetan-Chinese comparable corpus, using word2vec representations as the input and Gibbs sampling to estimate the model parameters. To align Tibetan topics with Chinese topics, we calculate the similarity between them from the text-topic distributions, via a voting method based on cosine distance, Euclidean distance, Hellinger distance, and KL distance.