Abstract:

Background: Extractive question-answering (EQA) is a useful natural language processing (NLP) application for answering patient-specific questions by locating answers in their clinical notes. Realistic clinical EQA can involve multiple answers to a single question and multiple focus points in 1 question, both of which are lacking in existing data sets for the development of artificial intelligence solutions.

Objective: This study aimed to create a data set for developing and evaluating clinical EQA systems that can handle natural multianswer and multifocus questions.

Methods: We leveraged the annotated relations from the 2018 National NLP Clinical Challenges corpus to generate an EQA data set. Specifically, the 1-to-N, M-to-1, and M-to-N drug-reason relations were included to form the multianswer and multifocus question-answering entries, which represent more complex and natural challenges in addition to the basic 1-drug-1-reason cases. A baseline solution was developed and tested on the data set.

Results: The derived RxWhyQA data set contains 96,939 QA entries. Among the answerable questions, 25% require multiple answers, and 2% ask about multiple drugs within 1 question. Frequent cues were observed around the answers in the text, and 90% of the drug and reason terms occurred within the same or an adjacent sentence. The baseline EQA solution achieved a best F1-score of 0.72 on the entire data set. On specific subsets, the F1-score was 0.93 for the unanswerable questions, 0.48 for single-drug questions versus 0.60 for multidrug questions, and 0.54 for single-answer questions versus 0.43 for multianswer questions.

Conclusions: The RxWhyQA data set can be used to train and evaluate systems that need to handle multianswer and multifocus questions. Specifically, multianswer EQA appears to be challenging and therefore warrants more investment in research.
We created and shared a clinical EQA data set with multianswer and multifocus questions that would channel future research efforts toward more realistic scenarios.
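
The grouping of drug-reason relations into QA entries described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_qa_entries`, the question template, and the rule that drugs sharing exactly the same set of reasons are merged into one multifocus question are all assumptions made for illustration.

```python
from collections import defaultdict


def build_qa_entries(relations):
    """Group (drug, reason) pairs into QA entries (hypothetical scheme).

    A drug linked to several reasons yields a multianswer entry; drugs that
    share exactly the same set of reasons are merged into one multifocus entry.
    """
    # Collect every reason annotated for each drug (covers 1-to-1 and 1-to-N).
    reasons_by_drug = defaultdict(set)
    for drug, reason in relations:
        reasons_by_drug[drug].add(reason)

    # Merge drugs whose reason sets are identical (covers M-to-1 and M-to-N).
    drugs_by_reasons = defaultdict(list)
    for drug, reasons in reasons_by_drug.items():
        drugs_by_reasons[frozenset(reasons)].append(drug)

    entries = []
    for reasons, drugs in drugs_by_reasons.items():
        drugs = sorted(drugs)
        entries.append({
            # Assumed question template, not taken from the data set.
            "question": "Why was the patient prescribed " + " and ".join(drugs) + "?",
            "drugs": drugs,
            "answers": sorted(reasons),
        })
    return entries


# Toy relations, invented for this example only.
relations = [
    ("aspirin", "chest pain"),
    ("lasix", "edema"),
    ("lasix", "CHF"),
    ("tylenol", "fever"),
    ("ibuprofen", "fever"),
]
entries = build_qa_entries(relations)
```

Under these assumptions, `lasix` produces a single-drug entry with two answers, while `tylenol` and `ibuprofen` are combined into one two-drug question with a single answer.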