Abstract

Introduction: The purpose of this study is to develop custom, open-source, LLM-enabled pipelines and agents for the extraction and standardization of data from free-text medical records.

Procedures: This project aims to create a versatile tool for healthcare data analysis, with initial development utilizing our extensive collection (over 10,000 documents) of renal cell carcinoma pathology reports. These reports are an ideal starting point: they cover over 20 years of care and come from a racially and ethnically diverse cohort, with over a quarter of patients from traditionally underrepresented backgrounds. The jargon-abundant and highly technical nature of kidney cancer pathology represents a major challenge. We will accomplish this objective using an approach that balances harnessing advanced AI capabilities with practicality and efficiency in real-world applications. First, we will leverage GPT-3.5/4 to generate a comprehensive labeled dataset (referred to as pseudo labels: machine-applied labels that will be used for training a new model) tailored for pathology, radiology, and clinician reports of kidney cancer patients. Second, we will use the pseudo-labeled datasets to train a smaller, open-source, yet highly efficient model for our specific needs (termed model distillation). Llama 2, known for its high performance in diverse tasks, is our model of choice. Its open-source nature and relatively modest computational requirements make it ideal for deployment within medical institutions. The ability to run locally ensures the protection of patient health information and reduces reliance on costlier proprietary models.
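The two-step procedure (teacher pseudo-labeling, then distillation into a student model) implies an intermediate dataset that pairs each report with its machine-applied labels. A minimal sketch of how such training records might be assembled is given here. The field keys, the prompt wording, the helper names `make_training_record` and `to_jsonl`, and the example report are all hypothetical illustrations, not the authors' actual format; the three fields are those named in the pilot evaluation.

```python
import json

# Hypothetical keys for the three pilot extraction fields.
FIELDS = ("histology_type", "biopsy_site", "biopsy_procedure")

# Hypothetical prompt wording; the abstract does not specify one.
PROMPT_TEMPLATE = (
    "Extract the following fields from the pathology report as JSON: "
    "{fields}.\n\nReport:\n{report}\n\nJSON:"
)


def make_training_record(report_text, teacher_labels):
    """Pair one report with its teacher-generated pseudo labels."""
    missing = [f for f in FIELDS if f not in teacher_labels]
    if missing:
        raise ValueError(f"teacher output is missing fields: {missing}")
    prompt = PROMPT_TEMPLATE.format(
        fields=", ".join(FIELDS), report=report_text.strip()
    )
    # Keep only the expected fields, in a stable key order.
    completion = json.dumps(
        {f: teacher_labels[f] for f in FIELDS}, sort_keys=True
    )
    return {"prompt": prompt, "completion": completion}


def to_jsonl(records):
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r) for r in records)


# Synthetic example report and labels, invented for illustration only.
example = make_training_record(
    "Left kidney, partial nephrectomy: clear cell renal cell carcinoma.",
    {
        "histology_type": "clear cell renal cell carcinoma",
        "biopsy_site": "left kidney",
        "biopsy_procedure": "partial nephrectomy",
    },
)
print(to_jsonl([example]))
```

Records in this prompt/completion shape are a common input for supervised fine-tuning of an open-source model, though the abstract does not state which training format was used.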
The rationale is threefold: first, to benefit from the superior performance and reasoning capabilities of GPT models; second, to create a streamlined model that is resource-efficient and tailored to our specific requirements; third, to enable the final streamlined model to be freely available by utilizing open-source LLMs (e.g., Llama 2). To assess the feasibility of our methodology, we evaluated the ability of GPT-3.5 to label our data with high accuracy.

Data Summary: Our pilot set of pathology reports consisted of 109 documents and 3 extraction fields: histology type, biopsy site, and biopsy procedure. Using GPT-3.5, we achieved an accuracy of over 95% in each extracted field type, thereby demonstrating its ability to act as a proficient “teacher” model for our future open-source “student” model.

Conclusions: Our preliminary results suggest that current state-of-the-art LLMs are highly proficient in extracting and standardizing information from clinical reports. Building out a full, open-source, LLM-enabled pipeline will increase the accuracy, flexibility, and efficiency of medical data extraction across institutions and lead to quicker and more informed decision-making in patient care and research.

Citation Format: David Hein, Alana Christie, Hua Zhong, Ellen Araj, James Brugarolas, Lindsay Cowell, Payal Kapur, Andrew Jamieson. Learning Llama Agents for medical record analysis and standardization [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 7390.
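The per-field accuracy reported in the Data Summary could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not say how a prediction was scored as correct, so exact match after case and whitespace normalization is assumed here, and the field keys, function names, and toy data are hypothetical.

```python
# Hypothetical keys for the three pilot extraction fields.
FIELDS = ("histology_type", "biopsy_site", "biopsy_procedure")


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(str(value).lower().split())


def per_field_accuracy(predictions, references, fields=FIELDS):
    """Return the fraction of documents labeled correctly, separately for each field.

    Assumes exact match after normalization; a missing predicted field counts as wrong.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one document is required")
    correct = {f: 0 for f in fields}
    for pred, ref in zip(predictions, references):
        for f in fields:
            if f in pred and normalize(pred[f]) == normalize(ref[f]):
                correct[f] += 1
    return {f: correct[f] / len(references) for f in fields}


# Toy data invented for illustration; not drawn from the study.
references = [
    {"histology_type": "clear cell RCC", "biopsy_site": "left kidney",
     "biopsy_procedure": "partial nephrectomy"},
    {"histology_type": "papillary RCC", "biopsy_site": "right kidney",
     "biopsy_procedure": "core biopsy"},
]
predictions = [
    {"histology_type": "Clear cell RCC", "biopsy_site": "left  kidney",
     "biopsy_procedure": "partial nephrectomy"},
    {"histology_type": "papillary RCC", "biopsy_site": "left kidney",
     "biopsy_procedure": "core biopsy"},
]
accuracy = per_field_accuracy(predictions, references)
print(accuracy)
```

Reporting accuracy per field, rather than one pooled number, matches how the pilot result is stated and makes it visible when a single field lags the others.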

BURExtract-Llama: An LLM for Clinical Concept Extraction in Breast Ultrasound Reports

Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports

From Text to Tables: A Local Privacy Preserving Large Language Model for Structured Information Retrieval from Medical Documents

Privacy-preserving large language models for structured medical information retrieval

LLM-AIx: An open source pipeline for Information Extraction from unstructured medical text based on privacy preserving Large Language Models

LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation

Development of a privacy preserving large language model for automated data extraction from thyroid cancer pathology reports

Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports

Harnessing Large Language Models for Structured Reporting in Breast Ultrasound: A Comparative Study of Open AI (GPT-4.0) and Microsoft Bing (GPT-4)

Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports

Human-level information extraction from clinical reports with fine-tuned language models

Enhancing Clinical Data Extraction from Pathology Reports: A Comparative Analysis of Large Language Models

An Entity Extraction Pipeline for Medical Text Records Using Large Language Models: Analytical Study

MGH Radiology Llama: A Llama 3 70B Model for Radiology

Extraction and classification of structured data from unstructured hepatobiliary pathology reports using large language models: a feasibility study compared with rules-based natural language processing

Abstract 7390: Learning Llama Agents for medical record analysis and standardization

Automated Extraction of Patient-Centered Outcomes After Breast Cancer Treatment: An Open-Source Large Language Model-Based Toolkit

Empowering PET Imaging Reporting with Retrieval-Augmented Large Language Models and Reading Reports Database: A Pilot Single Center Study

Automated Clinical Data Extraction with Knowledge Conditioned LLMs

Assessing Large Language Models for Oncology Data Inference from Radiology Reports