Natural Language Processing for Radiation Oncology: Personalizing Treatment Pathways

Hui Lin,1,2 Lisa Ni,1 Christina Phuong,1 Julian C Hong1,3,4

1 Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
2 UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, San Francisco, CA, USA
3 Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
4 Joint Program in Computational Precision Health, University of California, Berkeley and San Francisco, Berkeley, CA, USA

Correspondence: Julian C Hong

Abstract: Natural language processing (NLP), a technology that translates human language into machine-readable data, is revolutionizing numerous sectors, including cancer care. This review outlines the evolution of NLP and its potential for crafting personalized treatment pathways for cancer patients. Leveraging NLP's ability to transform unstructured medical data into structured learnable formats, researchers can tap into the potential of big data for clinical and research applications. Significant advancements in NLP have spurred interest in developing tools that automate information extraction from clinical text, potentially transforming medical research and clinical practices in radiation oncology. Applications discussed include symptom and toxicity monitoring, identification of social determinants of health, improving patient-physician communication, patient education, and predictive modeling. However, several challenges impede the full realization of NLP's benefits, such as privacy and security concerns, biases in NLP models, and the interpretability and generalizability of these models. Overcoming these challenges necessitates a collaborative effort between computer scientists and the radiation oncology community.
This paper serves as a comprehensive guide to understanding the intricacies of NLP algorithms, their performance assessment, past research contributions, and the future of NLP in radiation oncology research and clinics.

Keywords: artificial intelligence, personalized medicine, radiation therapy, natural language processing

Natural Language Processing (NLP), a critical domain in artificial intelligence (AI), has revolutionized a myriad of general language applications, from search engines and recommendation systems to digital personal assistants [1–3]. In the context of healthcare, NLP holds considerable promise, especially in the field of oncology. With the proliferation of Electronic Health Records (EHRs), a wealth of unstructured data is available for exploration. In fact, the data held by the US healthcare system is estimated to have exceeded 2000 exabytes, much of it unstructured text in clinical notes that demands sophisticated NLP techniques for utilization [4]. This vast reservoir of EHR data has catalyzed research efforts to unearth meaningful insights for cancer care. For instance, NLP techniques have been employed to identify patients at risk of familial cancers using family history information documented in clinical narratives [5] and to automate the extraction of cancer staging information from unstructured clinical narratives [6], leading to more efficient patient stratification and appropriate treatment plans. This horizon is now expanding towards radiation oncology, where personalization and precision play pivotal roles. The remainder of this review presents a comprehensive exploration of the evolution of NLP models and discusses the potential implications of these developments in the context of radiation oncology, specifically in personalizing treatment pathways.
To illustrate the dynamic interplay between data, models, and applications of NLP in radiation oncology, Figure 1 provides a schematic overview that encapsulates the progression from raw clinical data to actionable oncological insights, highlighting the transformative potential of NLP in personalizing radiation therapy pathways.

Figure 1 A schematic overview of the flow from foundational data to diverse applications in the radiation oncology domain empowered by NLP methods. The bottom layer represents various foundational data sources used in radiation oncology. The middle layer categorizes the predominant NLP methodologies into three classes: knowledge-based, statistical, and deep learning, where knowledge-based methods rely on domain-specific rules, statistical methods employ algorithms to infer patterns from data, and deep learning utilizes complex neural network architectures for more nuanced language understanding. The top layer displays the key applications of these NLP methods in radiation oncology.

The rudimentary phase of NLP, dating back to the 1950s, was dominated by rule-based systems [7].
