Order-free Medicine Combination Prediction with Graph Convolutional Reinforcement Learning

Shanshan Wang,Pengjie Ren,Zhumin Chen,Zhaochun Ren,Jun Ma,Maarten de Rijke
DOI: https://doi.org/10.1145/3357384.3357965
2019-01-01
Abstract:Medicine Combination Prediction (MCP) based on Electronic Health Record (EHR) can assist doctors to prescribe medicines for complex patients. Previous studies on MCP either ignore the correlations between medicines (i.e., MCP is formulated as a binary classifcation task), or assume that there is a sequential correlation between medicines (i.e., MCP is formulated as a sequence prediction task). The latter is unreasonable because the correlations between medicines should be considered in an order-free way. Importantly, MCP must take additional medical knowledge (e.g., Drug-Drug Interaction (DDI)) into consideration to ensure the safety of medicine combinations. However, most previous methods for MCP incorporate DDI knowledge with a post-processing scheme, which might undermine the integrity of proposed medicine combinations. In this paper, we propose a graph convolutional reinforcement learning model for MCP, named Combined Order-free Medicine Prediction Network (CompNet), that addresses the issues listed above. CompNet casts the MCP task as an order-free Markov Decision Process (MDP) problem and designs a Deep Q Learning (DQL) mechanism to learn correlative and adverse interactions between medicines. Specifcally, we frst use a Dual Convolutional Neural Network (Dual-CNN) to obtain patient representations based on EHRs. Then, we introduce the medicine knowledge associated with predicted medicines to create a dynamic medicine knowledge graph, and use a Relational Graph Convolutional Network (R-GCN) to encode it. Finally, CompNet selects medicines by fusing the combination of patient information and the medicine knowledge graph. Experiments on a benchmark dataset, i.e., MIMIC-III, demonstrate that CompNet signifcantly outperforms state-of-the-art methods and improves a recently proposed model by 3.74%pt, 6.64%pt in terms of Jaccard and F1 metrics.
What problem does this paper attempt to address?