MIPO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning
Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Zhendong Niu, Chengqi Zhang
2021-01-01
Abstract: Healthcare representation learning on Electronic Health Records (EHRs) is crucial for downstream medical prediction tasks in health informatics. Many natural language processing techniques, such as word2vec, RNNs, and self-attention, have been adapted to learn medical representations from hierarchical and time-stamped EHR data, but they fail when either general or task-specific data is lacking. Hence, some recent works train healthcare representations by incorporating a medical ontology (a.k.a. knowledge graph) through self-supervised tasks such as diagnosis prediction, but (1) the small-scale, monotonous ontology is insufficient for robust learning, and (2) critical contexts or dependencies underlying patient journeys are barely exploited to enhance ontology learning. To address these challenges, we propose a Transformer-based representation learning approach, Mutual Integration of Patient journey and medical Ontology (MIPO), which is a robust end-to-end framework. Specifically, the proposed method focuses on task-specific representation learning through a sequential diagnosis prediction task, which also benefits the ontology-based disease typing task. To integrate information in the patient's visiting records, we further introduce a graph-embedding module, which can mitigate the challenge of data insufficiency in healthcare. In this way, MIPO creates a mutual integration that benefits both healthcare representation learning and medical ontology embedding. This integration is achieved by joint training over the fused embeddings of the two modules, targeting the task-specific prediction and ontology-based disease typing tasks simultaneously. Extensive experiments on two real-world benchmark datasets show that MIPO consistently achieves better performance than state-of-the-art methods, regardless of whether the training data is sufficient. MIPO also derives more interpretable diagnosis embeddings than its counterparts.
∗ Corresponding author. KDD ’22, August 14–18, 2022, Washington, DC. © 2022 Association for Computing Machinery. CCS CONCEPTS • Information systems → Data mining; • Applied computing → Health informatics.