Alibaba DAMO Academy at TREC Clinical Trials 2021: Exploring Embedding-based First-stage Retrieval with TrialMatcher

Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Zheng Yuan, Songfang Huang
2021-01-01
Abstract: This paper describes the submissions of Alibaba DAMO Academy to the TREC 2021 Clinical Trials Track, where the task is to match eligible clinical trials to given patient notes. Our systems follow the standard retrieval-reranking procedure. We propose a novel embedding-based retrieval model, TrialMatcher, as the retriever. TrialMatcher contains a patient note encoder and a clinical trial encoder, pre-trained on 370k clinical trial documents. It retrieves relevant clinical trials based on distances in the embedding space. We then use different re-rankers to reorder the candidates returned by TrialMatcher. In automatic runs, the re-rankers are trained on a relevant dataset or a synthetic patient-trial relevance dataset. In manual runs, the re-rankers are trained on annotations derived from a human-in-the-loop active learning strategy. Our automatic runs rank second among all participants on all four metrics. Our manual runs rank first on one metric and second on the other three.
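The retrieval-reranking procedure described in the abstract can be illustrated with a minimal Python sketch. This is not TrialMatcher itself: the hashed bag-of-words `embed` function is a toy stand-in for the pre-trained patient note and clinical trial encoders, Euclidean distance is an assumed choice of embedding-space distance, and `overlap_scorer` is a hypothetical placeholder for a trained re-ranker. All function names, the embedding size, and the example trial IDs are assumptions made for illustration.

```python
import math
import zlib

DIM = 64  # toy embedding size (assumption, not from the paper)


def embed(text, dim=DIM):
    """Toy stand-in for an encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 gives a hash that is stable across Python processes
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def distance(a, b):
    """Euclidean distance between two embeddings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(note, trials, k=2):
    """First stage: return the k trial IDs closest to the note."""
    query = embed(note)
    ranked = sorted(trials, key=lambda tid: distance(query, embed(trials[tid])))
    return ranked[:k]


def overlap_scorer(note, trial_text):
    """Hypothetical re-ranker score: Jaccard overlap of token sets."""
    a, b = set(note.lower().split()), set(trial_text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def rerank(note, candidates, trials, scorer=overlap_scorer):
    """Second stage: reorder retrieved candidates by re-ranker score."""
    return sorted(candidates, key=lambda tid: scorer(note, trials[tid]), reverse=True)


# Hypothetical example data; the IDs and texts are made up.
trials = {
    "T1": "adult patients with type 2 diabetes and hypertension",
    "T2": "children with asthma receiving inhaled corticosteroids",
    "T3": "adults with type 2 diabetes on metformin",
}
note = "adult patients with type 2 diabetes and hypertension"
candidates = retrieve(note, trials, k=2)
reranked = rerank(note, candidates, trials)
```

In a real system the encoders would be neural models and the first stage would typically use an approximate nearest-neighbor index rather than sorting every trial, but the two-stage shape is the same: a cheap embedding-distance retriever narrows the pool, and a more expensive re-ranker orders the survivors.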