Abstract:

Background: Under the paradigm of precision medicine (PM), patients with the same disease can receive different personalized therapies according to their clinical and genetic features. These therapies are determined by the totality of all available clinical evidence, including results from case reports, clinical trials, and systematic reviews. However, it is increasingly difficult for physicians to find such evidence in the scientific literature, whose volume is growing at an unprecedented pace.

Objective: In this work, we propose the PM-Search system to facilitate the retrieval of clinical literature that contains critical evidence for or against giving specific therapies to certain cancer patients.

Methods: The PM-Search system combines a baseline retriever that selects document candidates at a large scale and an evidence reranker that finely reorders the candidates based on their evidence quality. The baseline retriever uses query expansion and keyword matching with the Elasticsearch retrieval engine, and the evidence reranker fits pretrained language models to expert annotations that are derived from an active learning strategy.

Results: The PM-Search system achieved the best performance in the retrieval of high-quality clinical evidence at the Text Retrieval Conference PM Track 2020, outperforming the second-ranking systems by large margins (0.4780 vs 0.4238 for standard normalized discounted cumulative gain at rank 30 and 0.4519 vs 0.4193 for exponential normalized discounted cumulative gain at rank 30).

Conclusions: We present PM-Search, a state-of-the-art search engine to assist the practice of evidence-based PM. PM-Search uses a novel active learning strategy based on Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT) that models evidence quality and improves the model performance. Our analyses show that evidence quality is a distinct aspect from general relevance, and specific modeling of evidence quality beyond general relevance is required for a PM search engine.
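The two metrics reported in the Results, standard and exponential normalized discounted cumulative gain (NDCG) at rank 30, differ only in how a graded judgment is turned into a gain. The sketch below uses a commonly seen formulation (linear gain versus `2**grade - 1`, with a `log2(rank + 1)` discount). The abstract does not give the track's exact grading scale or gain mapping, so the function names `dcg` and `ndcg`, the grade values, and the example rankings are illustrative assumptions rather than the official evaluation code.

```python
import math
from typing import Optional, Sequence


def dcg(grades: Sequence[float], k: int, exponential: bool = False) -> float:
    """Discounted cumulative gain over the top-k grades of a ranking."""
    total = 0.0
    for rank, grade in enumerate(grades[:k], start=1):
        # Assumed gain mapping: linear ("standard") or 2**grade - 1 ("exponential").
        gain = (2.0 ** grade - 1.0) if exponential else float(grade)
        total += gain / math.log2(rank + 1)
    return total


def ndcg(
    ranked_grades: Sequence[float],
    k: int,
    exponential: bool = False,
    all_grades: Optional[Sequence[float]] = None,
) -> float:
    """DCG normalized by the DCG of an ideal ordering.

    `all_grades` is the full pool of judged grades for the topic; if omitted,
    the ideal ordering is built from the ranked list itself (an assumption
    that only holds when every judged document was retrieved).
    """
    pool = ranked_grades if all_grades is None else all_grades
    ideal = dcg(sorted(pool, reverse=True), k, exponential)
    if ideal == 0.0:
        return 0.0
    return dcg(ranked_grades, k, exponential) / ideal


# Hypothetical evidence-quality grades for one topic, in ranked order.
good_ranking = [2, 1, 0]
poor_ranking = [0, 1, 2]
print(ndcg(good_ranking, k=30), ndcg(poor_ranking, k=30))
print(ndcg(poor_ranking, k=30, exponential=True))
```

Because the exponential gain weights the highest grade more heavily, a ranking that buries the best evidence is penalized more under the exponential variant than under the standard one, which is why the two numbers for the same system can differ.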

EVIDENCEMINER: Textual Evidence Discovery for Life Sciences

Biomedical Evidence Generation Engine

Biomedical evidence engineering for data-driven discovery

BPMiner: mining developers' behavior patterns from screen-captured task videos.

SciMiner: web-based literature mining tool for target identification and functional enrichment analysis

Scientific Discourse Tagging for Evidence Extraction

State-of-the-Art Evidence Retriever for Precision Medicine: Algorithm Development and Validation

MinerU: An Open-Source Solution for Precise Document Content Extraction

OVERVIEW OF TEXT-MINING IN LIFE-SCIENCES

MedMiner: An Internet Text-Mining Tool for Biomedical Information, with Application to Gene Expression Profiling

An Ensemble Semantic Textual Similarity Measure Based on Multiple Evidences for Biomedical Documents

Extracting Information for Meaningful Function Inference through Text-Mining

SciEv: Finding Scientific Evidence Papers for Scientific News

Triangulating evidence in health sciences with Annotated Semantic Queries

Aceso: PICO-Guided Evidence Summarization on Medical Literature

Alibaba DAMO Academy at TREC Precision Medicine 2020: State-of-the-art Evidence Retriever for Precision Medicine with Expert-in-the-loop Active Learning

ENQUIRE RECONSTRUCTS AND EXPANDS CONTEXT-SPECIFIC CO-OCCURRENCE NETWORKS FROM BIOMEDICAL LITERATURE

Aceso-DSAL: Discovering Clinical Evidences from Medical Literature Based on Distant Supervision and Active Learning

SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality

Evidence Inference 2.0: More Data, Better Models

SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation