Abstract

Background: Patient-generated health data (PGHD) captured via smart devices or digital health technologies can reflect an individual’s health journey. PGHD enables tracking and monitoring of personal health conditions, symptoms, and medications outside the clinic, which is crucial for self-care and shared clinical decisions. In addition to self-reported measures and structured PGHD (eg, self-screening, sensor-based biometric data), free-text and unstructured PGHD (eg, patient care notes, medical diaries) can provide a broader view of a patient’s journey and health condition. Natural language processing (NLP) is used to process and analyze unstructured data to create meaningful summaries and insights, showing promise to improve the utilization of PGHD.

Objective: Our aim is to understand and demonstrate the feasibility of an NLP pipeline to extract medication and symptom information from real-world patient and caregiver data.

Methods: We report a secondary data analysis, using a data set collected from 24 parents of children with special health care needs (CSHCN) who were recruited via a nonrandom sampling approach. Participants used a voice-interactive app for 2 weeks, generating free-text patient notes (audio transcription or text entry). We built an NLP pipeline using a zero-shot approach (adaptive to low-resource settings). We used named entity recognition (NER) and medical ontologies (RxNorm and SNOMED CT [Systematized Nomenclature of Medicine Clinical Terms]) to identify medications and symptoms. Sentence-level dependency parse trees and part-of-speech tags were used to extract additional entity information using the syntactic properties of a note. We assessed the data; evaluated the pipeline with the patient notes; and reported the precision, recall, and F1 scores.

Results: In total, 87 patient notes were included (audio transcriptions n=78 and text entries n=9) from 24 parents who have at least one CSHCN. The participants were between the ages of 26 and 59 years. The majority were White (n=22, 92%), had more than one child (n=16, 67%), lived in Ohio (n=22, 92%), had mid- or upper-mid household income (n=15, 62.5%), and had higher level education (n=14, 58%). Out of 87 notes, 30 were drug and medication related, and 46 were symptom related. We captured medication instances (medication, unit, quantity, and date) and symptoms satisfactorily (precision >0.65, recall >0.77, F1>0.72). These results indicate the potential of NER and dependency parsing, combined in an NLP pipeline, for information extraction from unstructured PGHD.

Conclusions: The proposed NLP pipeline was found to be feasible for use with real-world unstructured PGHD to accomplish medication and symptom extraction. Unstructured PGHD can be leveraged to inform clinical decision-making, remote monitoring, and self-care including medical adherence and chronic disease management. With customizable information extraction methods using NER and medical ontologies, NLP models can feasibly extract a broad range of clinical information from unstructured PGHD in low-resource settings (eg, a limited number of patient notes or training data).
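
To make the evaluation described in the abstract concrete, the following is a minimal sketch of entity extraction from a free-text note, scored with precision, recall, and F1. It is not the authors' pipeline: a tiny hypothetical word list stands in for the RxNorm and SNOMED CT lookups, a regular expression stands in for NER and dependency parsing, and the sample note, the gold annotations, and all names (`MEDICATIONS`, `SYMPTOMS`, `DOSE_RE`, `extract_entities`, `precision_recall_f1`) are assumptions made for illustration only.

```python
import re

# Hypothetical mini-lexicon standing in for an RxNorm lookup (assumption).
MEDICATIONS = {"ibuprofen", "melatonin", "albuterol"}
# Hypothetical symptom list standing in for a SNOMED CT lookup (assumption).
SYMPTOMS = {"fever", "cough", "headache"}

# A quantity followed by a unit, e.g. "5 ml" or "200 mg".
DOSE_RE = re.compile(
    r"\b(\d+(?:\.\d+)?)\s*(mg|ml|mcg|tablets?|puffs?)\b", re.IGNORECASE
)


def extract_entities(note):
    """Return a set of (label, text) pairs found in a free-text note."""
    found = set()
    for token in re.findall(r"[a-z]+", note.lower()):
        if token in MEDICATIONS:
            found.add(("MEDICATION", token))
        elif token in SYMPTOMS:
            found.add(("SYMPTOM", token))
    for quantity, unit in DOSE_RE.findall(note):
        found.add(("DOSE", f"{quantity} {unit.lower()}"))
    return found


def precision_recall_f1(predicted, gold):
    """Entity-level scores from exact matches between predicted and gold sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example note and gold annotations (not from the study data).
note = "Gave him 5 ml of ibuprofen at noon for a fever; he was still tired and coughing."
gold = {
    ("MEDICATION", "ibuprofen"),
    ("DOSE", "5 ml"),
    ("SYMPTOM", "fever"),
    ("SYMPTOM", "coughing"),  # missed: the toy lexicon only contains "cough"
}

predicted = extract_entities(note)
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On this invented note the sketch prints `precision=1.00 recall=0.75 f1=0.86`: every extracted entity is correct, but the inflected form "coughing" is missed. Misses of this kind are one reason a pipeline might rely on ontology matching and syntactic analysis rather than exact word lookup.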

Automatic Extraction of Medication Data from Semi-Structured Prescriptions

Extracting Information From Chinese Prescription Pharmaceuticals Based on NPOS Shortest-Path Word Segmentation Algorithm

Extracting Drug Names and Associated Attributes From Discharge Summaries: Text Mining Study

medExtractR: A targeted, customizable approach to medication extraction from electronic health records

Converting Semi-structured Clinical Medical Records into Information and Knowledge

MADEx: A System for Detecting Medications, Adverse Drug Events, and Their Relations from Clinical Notes

Automated Drug-Related Information Extraction from French Clinical Documents: ReLyfe Approach

Automatic Conversion of Electronic Medical Record Text for OpenEHR Based on Semantic Analysis

Weakly Supervised Medication Regimen Extraction from Medical Conversations

Extracting Daily Dosage from Medication Instructions in EHRs: An Automated Approach and Lessons Learned

A Study of Deep Learning Approaches for Medication and Adverse Drug Event Extraction from Clinical Text

Hybrid approach combining deep learning and a rule based expert system for concept extraction from prescriptions

Extraction of Geriatric Syndromes From Electronic Health Record Clinical Notes: Assessment of Statistical Natural Language Processing Methods

From Free-text Drug Labels to Structured Medication Terminology with BERT and GPT

Feature-Based Learning in Drug Prescription System for Medical Clinics

Extracting Medical Information From Free-Text and Unstructured Patient-Generated Health Data Using Natural Language Processing Methods: Feasibility Study With Real-world Data

Zero- and Few-shot Named Entity Recognition and Text Expansion in Medication Prescriptions using ChatGPT

Extraction of Information Related to Drug Safety Surveillance From Electronic Health Record Notes: Joint Modeling of Entities and Relations Using Knowledge-Aware Neural Attentive Models (Preprint)

EliXR: an approach to eligibility criteria extraction and representation

Extraction of Information Related to Drug Safety Surveillance From Electronic Health Record Notes: Joint Modeling of Entities and Relations Using Knowledge-Aware Neural Attentive Models

Classifying Characteristics of Opioid Use Disorder From Hospital Discharge Summaries Using Natural Language Processing