Automating Responses to Patient Portal Messages Using Generative AI

Amarpreet Kaur, Alex Budko, Katrina Liu, Eric Eaton, Bryan Steitz, Kevin B. Johnson
DOI: https://doi.org/10.1101/2024.04.25.24306183
2024-04-26
medRxiv
Abstract:
Background: Patient portals serve as vital bridges between patients and providers, playing an increasing role in healthcare communication. The rising volume and complexity of portal messages are exacerbating physician and nursing burnout. Recent studies have demonstrated that AI chatbots can generate message responses that are viewed favorably by healthcare professionals; however, these studies have not included the diverse range of messages typically found in patient portals. Our goal is to investigate the quality of GPT-generated message responses across the spectrum of message types within a patient portal.
Methods: We used novel prompt engineering techniques to craft synthetic responses tailored to adult primary care patients. We enrolled a sample of primary care providers in a cross-sectional study to compare authentic patient portal message responses with synthetic responses generated by GPT-4. The survey assessed each message's empathy, relevance, medical accuracy, and readability on a scale from 0 to 5. Respondents were also asked to identify which messages were GPT-generated and which were provider-generated. Mean scores for all metrics were computed for subsequent analysis.
Results: A total of 49 health care providers participated in the survey (59% completion rate), comprising 16 physicians and 32 advanced practice providers (APPs). When presented with GPT vs. authentic message response pairs, participants correctly identified GPT-generated responses 73% of the time and correctly identified authentic responses 50% of the time. In comparison to messages generated by physicians, GPT-4-generated messages exhibited higher mean scores for empathy (3.57 vs. 3.07, p < 0.001), relevance (3.94 vs. 3.81, p = 0.08), accuracy (4.05 vs. 3.95, p = 0.12), and readability (4.5 vs. 4.13, p < 0.001).
Limitations: This is a single-site, single-specialty study, and it is limited by its use of synthetic data.
Conclusion: Our findings affirm the potential of GPT-generated patient portal message responses to achieve levels of empathy, relevance, and readability comparable to those of typical responses, as rated by health care providers, and indicate promising prospects for their integration into the healthcare sector. Additional studies should be conducted within provider workflows, with careful evaluation of patient attitudes and concerns about the ethics and quality of generated patient portal message responses in all settings.