Sentence Alignment for Biomedicine Texts Based on Gaussian Mixture Model

陈相,林鸿飞,杨志豪
DOI: https://doi.org/10.3969/j.issn.1003-0077.2010.04.010
2010-01-01
Abstract: A bilingual lexicon of biomedical terms plays an important role in biomedical cross-language information retrieval, and sentence alignment is the first step in building one. This paper applies a Gaussian mixture model and transfer learning to sentence alignment. The basic idea is to treat sentence alignment as a classification task, solved by Gaussian mixture model classifiers that use the anchor information contained in medical literature abstracts. In addition, the sentence alignment model is built by combining biomedical literature abstracts with the New Concept English corpus: transfer learning is used to train the length features and transfer them to the model. Experiments show that this improves the performance of the sentence alignment model.
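
The idea of treating alignment as classification with Gaussian mixture models can be illustrated with a minimal sketch. It is not the authors' implementation: the single length-ratio feature, the two-class setup (aligned versus not aligned), the equal class priors, the function names, and the toy training values are all assumptions made for illustration. One mixture is fitted per class with EM, and a candidate sentence pair is assigned to the class whose mixture gives it the higher likelihood.

```python
import math

def length_ratio(english: str, chinese: str) -> float:
    """Hypothetical length feature: English characters per Chinese character."""
    return len(english) / max(len(chinese), 1)

def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm(xs, k=2, iters=50, var_floor=1e-4):
    """Fit a 1-D Gaussian mixture by EM; returns [(weight, mean, variance), ...]."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    grand_mean = sum(xs) / n
    init_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return list(zip(weights, means, variances))

def log_likelihood(x, gmm):
    """Log density of x under a fitted mixture (small constant avoids log(0))."""
    return math.log(sum(w * gauss_pdf(x, m, v) for w, m, v in gmm) + 1e-300)

def is_aligned(x, aligned_gmm, unaligned_gmm):
    """Classify a feature value, assuming equal class priors."""
    return log_likelihood(x, aligned_gmm) > log_likelihood(x, unaligned_gmm)

# Toy, made-up length ratios (not data from the paper).
aligned_ratios = [2.6, 2.8, 2.9, 3.0, 3.0, 3.1, 3.2, 3.4]
unaligned_ratios = [0.5, 0.8, 1.0, 1.2, 6.0, 6.5, 7.0, 7.5]

aligned_gmm = fit_gmm(aligned_ratios)
unaligned_gmm = fit_gmm(unaligned_ratios)
print(is_aligned(3.0, aligned_gmm, unaligned_gmm))
print(is_aligned(7.0, aligned_gmm, unaligned_gmm))
```

In a transfer-learning setting such as the one the abstract describes, the length-feature mixtures might be estimated on one corpus and reused when classifying pairs from another; the sketch leaves that step out and trains and classifies on the same toy values.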