Knowledge-Driven Online Multimodal Automated Phenotyping System
Xin Xiong, Sara Morini Sweet, Molei Liu, Chuan Hong, Clara-Lea Bonzel, Vidul Ayakulangara Panickan, Doudou Zhou, Linshanshan Wang, Lauren Costa, Yuk-Lam Ho, Alon Geva, Kenneth D. Mandl, Suchun Cheng, Zongqi Xia, Kelly Cho, J. Michael Gaziano, Katherine P. Liao, Tianxi Cai, Tianrun Cai
DOI: https://doi.org/10.1101/2023.09.29.23296239
2023-10-03
medRxiv
Abstract: Though electronic health record (EHR) systems are a rich repository of clinical information with large potential, the use of EHR-based phenotyping algorithms is often hindered by inaccurate diagnostic records, the presence of many irrelevant features, and the requirement for a human-labeled training set. In this paper, we describe a knowledge-driven online multimodal automated phenotyping (KOMAP) system that i) generates a list of informative features by an online narrative and codified feature search engine (ONCE) and ii) enables the training of a multimodal phenotyping algorithm based on summary data. Powered by composite knowledge from multiple EHR sources, online article corpora, and a large language model, features selected by ONCE show high concordance with the state-of-the-art AI models (GPT4 and ChatGPT) and encourage large-scale phenotyping by providing a smaller but highly relevant feature set. Validation of the KOMAP system across four healthcare centers suggests that it can generate efficient phenotyping algorithms with robust performance. Compared to other methods requiring patient-level inputs and gold-standard labels, the fully online KOMAP provides a significant opportunity to enable multi-center collaboration.