Vision–language AI assistance in human pathology

Iris Marchal
DOI: https://doi.org/10.1038/s41587-024-02326-9
IF: 46.9
2024-07-19
Nature Biotechnology
Abstract: Pathologists have access to increasingly accurate artificial intelligence (AI) models that can make task-specific or agnostic predictions based on images or genomic data, but a multimodal AI pilot tailored to pathology is still missing. Writing in Nature, Lu et al. introduce PathChat, a generative AI model that can handle both visual and natural language inputs, as a copilot for human pathology. The AI model architecture integrates a previously developed vision encoder pretrained on over 100 million histology images with a 13-billion-parameter pretrained large language model. The model was fine-tuned using a dataset of over 450,000 instructions to construct PathChat. The authors assessed PathChat's performance using multiple-choice diagnostics and open-ended questions, comparing outcomes with available general-purpose AI assistance, including the best performing commercial model, GPT4V, which powers ChatGPT4. PathChat outperformed previous models on all tasks. Interestingly, adding clinical context such as patient age, sex, clinical history and radiology findings improved PathChat's accuracy on multiple-choice questions from 78.1% to 89.5%, showing that non-visual information can support more accurate diagnosis of histology images without needing specialized data processing.
biotechnology & applied microbiology