A Method of Automatic Annotation for Medical Record Text Based on Latent Dirichlet Allocation

Xinyu Jin, Qiliang Jin, Yuze Li
DOI: https://doi.org/10.2991/icectt-15.2015.58
2015-01-01
Abstract: With the rapid development of medical information technology, medical data, especially medical record text, remain difficult to analyze intelligently because the text has a loose grammatical structure. Latent semantic analysis has been widely researched and applied in text mining in recent years, and Latent Dirichlet Allocation (LDA), proposed by Blei, is one method for addressing this difficulty. This paper proposes an improved LDA based on a BM25 mixture-weights method to analyze Chinese medical record text and reports good performance.
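The abstract does not give the weighting formula, so the following is only a minimal sketch of standard Okapi BM25 term weighting, the kind of per-term weight that an LDA variant could use in place of raw counts. The function name `bm25_weights`, the parameter defaults `k1=1.5` and `b=0.75`, and the smoothed IDF form are assumptions for illustration, not details taken from the paper, and the way the paper mixes these weights into LDA is not shown here.

```python
import math
from collections import Counter


def bm25_weights(docs, k1=1.5, b=0.75):
    """Return one {term: BM25 weight} dict per tokenized document.

    Hypothetical helper: `docs` is a non-empty list of token lists,
    e.g. the output of a Chinese word segmenter run on each record.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    weights = []
    for d in docs:
        tf = Counter(d)
        doc_weights = {}
        for term, f in tf.items():
            # Smoothed IDF variant that stays positive for common terms
            # (an assumption; the paper may use a different form).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = f + k1 * (1 - b + b * len(d) / avg_len)
            doc_weights[term] = idf * f * (k1 + 1) / norm
        weights.append(doc_weights)
    return weights
```

Under this weighting, a term that appears in few records receives a larger weight than a term that appears in most of them, which is the usual motivation for reweighting counts before topic modeling.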