Extracting clinical concepts from user queries

Yue Zhao, John Handley
DOI: https://doi.org/10.48550/arXiv.1912.06262
2019-12-24
Abstract: Clinical concept extraction often begins with clinical Named Entity Recognition (NER). Because they are usually trained on annotated clinical notes, clinical NER models tend to struggle with tagging clinical entities in user queries, which differ structurally from clinical notes. User queries, unlike clinical notes, are often ungrammatical and incoherent. In many cases, a user query is composed of multiple clinical entities with no commas or conjunctions separating them. Using a dataset that mixes annotated clinical notes with synthesized user queries, we adapt a clinical NER model based on the BiLSTM-CRF architecture to tag clinical entities in user queries. Our contributions are the following: 1) We found that when trained on a mixture of synthesized user queries and clinical notes, the NER model performs better on both user queries and clinical notes. 2) We provide an end-to-end, easy-to-implement framework for clinical concept extraction from user queries.
Information Retrieval, Computation and Language, Machine Learning
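
The abstract describes training on a mixture of annotated clinical notes and synthesized user queries, where a query may contain several clinical entities with no separators between them. The abstract does not spell out the synthesis procedure, so the following is only a minimal sketch under stated assumptions: the lexicon, the entity types (PROBLEM, TEST, TREATMENT), the BIO tagging scheme, and the function names `bio_tags`, `synthesize_query`, and `mix_datasets` are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical entity lexicon of (surface form, entity type) pairs.
# These entries and types are illustrative assumptions, not the paper's data.
LEXICON = [
    ("chest pain", "PROBLEM"),
    ("shortness of breath", "PROBLEM"),
    ("chest x-ray", "TEST"),
    ("aspirin", "TREATMENT"),
]


def bio_tags(mention, label):
    """Split a mention on whitespace and assign BIO tags to its tokens."""
    tokens = mention.split()
    tags = ["B-" + label] + ["I-" + label] * (len(tokens) - 1)
    return tokens, tags


def synthesize_query(rng, max_entities=3):
    """Concatenate 1..max_entities mentions with no comma or conjunction."""
    k = rng.randint(1, max_entities)
    tokens, tags = [], []
    for mention, label in rng.sample(LEXICON, k):
        mention_tokens, mention_tags = bio_tags(mention, label)
        tokens.extend(mention_tokens)
        tags.extend(mention_tags)
    return tokens, tags


def mix_datasets(note_examples, n_queries, seed=0):
    """Combine annotated note examples with synthesized queries and shuffle."""
    rng = random.Random(seed)
    queries = [synthesize_query(rng) for _ in range(n_queries)]
    mixed = list(note_examples) + queries
    rng.shuffle(mixed)
    return mixed
```

Each example is a `(tokens, tags)` pair, so the mixed list could be passed to any sequence tagger that accepts token-level labels, such as a BiLSTM-CRF. The ratio of notes to synthesized queries is left to the caller through `n_queries`.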