Topic medical concept embedding: Multi-sense representation learning for medical concept

Feng Qian, Chengyue Gong, Lu-chen Liu, Lei Sha, Ming Zhang
DOI: https://doi.org/10.1109/BIBM.2017.8217683
2017-01-01
Abstract: Representation learning algorithms in the medical domain map high-dimensional real-world medical concepts to a low-dimensional vector space, encode rich medical knowledge, and have improved various machine learning applications in medicine. However, previous representation learning models in this domain fail to consider the multi-sense nature of medical concepts. Moreover, the relationships between the representations learned by previous models are implicit and can only be explained through visualization, which means poor interpretability. In this paper, we propose Topic Medical Concept Embedding (TMCE), a generative embedding model that addresses these two problems. TMCE learns multi-sense representations for a single medical concept and improves interpretability by explicitly modeling the relationships between concepts. In TMCE, the multi-sense representations of a concept are influenced by its contexts and its topics. In addition, dosage information, which was ignored by previous work, is also utilized in TMCE. An MCMC method is presented to jointly learn the two-layer topic embeddings and the multi-sense concept embeddings. Experimental results show that representations learned by TMCE outperform those learned by other strong baselines by a large margin on a multi-label diagnosis classification task. Several case studies further show that TMCE can learn medically correct multi-sense representations with better interpretability than other strong baselines.
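To make the idea of a multi-sense representation concrete, the following is a minimal sketch and not the TMCE model itself: it does not implement the generative model, the two-layer topic embeddings, dosage information, or the MCMC learning procedure described in the abstract. It only illustrates how one concept can hold several vectors, one per topic, and how a context can pick among them. All names (`sense_embeddings`, `context_embeddings`, `select_sense`), the topic labels, and the three-dimensional toy vectors are hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical toy data: one concept with one embedding per topic (sense).
sense_embeddings = {
    "aspirin": {
        "cardiovascular": [0.9, 0.1, 0.0],
        "pain_relief": [0.1, 0.9, 0.0],
    }
}

# Hypothetical embeddings for concepts that may co-occur with the target.
context_embeddings = {
    "myocardial_infarction": [0.8, 0.2, 0.1],
    "clopidogrel": [0.9, 0.0, 0.1],
    "headache": [0.0, 0.9, 0.2],
    "ibuprofen": [0.1, 0.8, 0.1],
}


def select_sense(concept, context):
    """Return the topic whose sense vector is closest to the mean context vector.

    This nearest-sense rule is an assumption made for illustration; it is
    not the inference procedure used in the paper.
    """
    context_vec = mean_vector([context_embeddings[c] for c in context])
    senses = sense_embeddings[concept]
    return max(senses, key=lambda topic: cosine(senses[topic], context_vec))


cardiac_sense = select_sense("aspirin", ["myocardial_infarction", "clopidogrel"])
pain_sense = select_sense("aspirin", ["headache", "ibuprofen"])
```

With these toy vectors, the same concept resolves to a different sense depending on the co-occurring concepts. In a learned model the sense vectors and their topic assignments would be estimated from data rather than written by hand.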