AutoPM3: Enhancing Variant Interpretation via LLM-driven PM3 Evidence Extraction from Scientific Literature

Shumin Li, Yiding Wang, Chi-man Liu, Yuanhua Huang, Tak-wah Lam, Ruibang Luo
DOI: https://doi.org/10.1101/2024.10.29.621006
2024-11-03
Abstract: Rare diseases, affecting 300 million people globally, often result from genetic variants. Whole-genome sequencing has made variant detection more cost-effective, but interpreting these variants remains challenging. Current clinical practice combines quantitative evidence with literature review, a process that is complex and time-consuming. We introduce AutoPM3, a method for automating the extraction of ACMG/AMP PM3 evidence from scientific literature using open-source LLMs. It combines an optimized RAG system for text comprehension with a TableLLM equipped with Text2SQL for data extraction. We evaluated AutoPM3 on PM3-Bench, a dataset we collected from ClinGen containing 1,027 variant-publication pairs. AutoPM3 significantly outperformed other methods in variant hit and in trans variant identification, owing to its four key modules. Additionally, we wrapped AutoPM3 in a user-friendly interface to enhance its accessibility. This study presents a powerful tool to improve rare disease diagnosis workflows by facilitating PM3-relevant evidence extraction from scientific literature.
Bioinformatics
What problem does this paper attempt to address?