Abstract:<h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Background and objective</h3><p>Question answering (QA), the identification of short, accurate answers to users' questions expressed in natural language, is a longstanding problem that has been widely studied in the open domain over recent decades. However, it remains a real challenge in the biomedical domain, because most existing systems support a limited number of question and answer types and still need improvement in precision on the questions they do support. Here, we present a semantic biomedical QA system named SemBioNLQA that handles yes/no, factoid, list, and summary natural language questions.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Methods</h3><p>This paper describes the architecture and an evaluation of the end-to-end biomedical QA system SemBioNLQA, which consists of question classification, document retrieval, passage retrieval and answer extraction modules. It takes natural language questions as input and outputs both short precise answers and summaries. The SemBioNLQA system, which deals with four types of questions, is based on (1) handcrafted lexico-syntactic patterns and a machine learning algorithm for question classification, (2) the PubMed search engine and UMLS similarity for document retrieval, (3) the BM25 model, stemmed words and UMLS concepts for passage retrieval, and (4) the UMLS Metathesaurus, BioPortal synonyms, sentiment analysis and a term frequency metric for answer extraction.</p><h3 class="u-h4 u-margin-m-top u-margin-xs-bottom">Results and conclusion</h3><p>Compared with current state-of-the-art biomedical QA systems, SemBioNLQA, a fully automated system, has the potential to deal with a large number of question and answer types.
SemBioNLQA quickly addresses users' information needs by returning exact answers (e.g., "yes", "no", a biomedical entity name, etc.) and ideal answers (i.e., paragraph-sized summaries of relevant information) for yes/no, factoid and list questions, whereas it provides only ideal answers for summary questions. Moreover, experimental evaluations on the biomedical questions and answers provided by the BioASQ challenge, in particular the 2015, 2016 and 2017 editions (in which we participated), show that SemBioNLQA achieves good performance compared with the most recent state-of-the-art systems and offers a practical and competitive alternative for helping information seekers find exact and ideal answers to their biomedical questions. The SemBioNLQA source code is publicly available at <a href="https://github.com/sarrouti/sembionlqa">https://github.com/sarrouti/sembionlqa</a>.</p>
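<p>The abstract names the BM25 model as one ingredient of passage retrieval. As a rough illustration only, the sketch below ranks tokenized passages with the standard Okapi BM25 formula. It is not the SemBioNLQA implementation: the function name <code>bm25_scores</code>, the parameter values <code>k1=1.2</code> and <code>b=0.75</code>, the whitespace tokenization and the sample passages are all assumptions made for this example, and the stemming and UMLS concept mapping mentioned in the abstract are left out.</p>

```python
import math
from collections import Counter


def bm25_scores(query_terms, passages, k1=1.2, b=0.75):
    """Score each tokenized passage against the query with Okapi BM25.

    Hypothetical helper for illustration; k1 and b are conventional
    defaults, not values taken from the SemBioNLQA paper.
    """
    if not passages:
        return []
    n = len(passages)
    avg_len = sum(len(p) for p in passages) / n
    # Document frequency: number of passages that contain each term.
    df = Counter(term for p in passages for term in set(p))
    scores = []
    for p in passages:
        tf = Counter(p)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue  # absent terms contribute nothing
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with passage length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(p) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy passages (invented for this example), tokenized on whitespace.
passages = [
    "aspirin reduces the risk of myocardial infarction".split(),
    "metformin is used to treat type 2 diabetes".split(),
    "the risk of stroke rises with hypertension".split(),
]
query = "aspirin myocardial infarction".split()
scores = bm25_scores(query, passages)
# Index of the highest-scoring passage.
best = max(range(len(passages)), key=scores.__getitem__)
```

<p>In a pipeline like the one described, the query and passages would presumably be stemmed and enriched with UMLS concepts before scoring, so that surface variants of the same biomedical term can match.</p>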

PubMedQA: A Dataset for Biomedical Research Question Answering

RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions

Biomedical Question Answering: A Survey of Approaches and Challenges

ScienceQA: a novel resource for question answering on scholarly articles

Generating Biomedical Question Answering Corpora from Q&A forums

Huatuo-26M, a Large-scale Chinese Medical QA Dataset

SemBioNLQA: A semantic biomedical question answering system for retrieving exact and ideal answers to natural language questions

RJUA-QA: A Comprehensive QA Dataset for Urology

Development of an Extractive Clinical Question Answering Dataset with Multi-Answer and Multi-Focus Questions

SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

BioTABQA: Instruction Learning for Biomedical Table Question Answering

Top K Relevant Passage Retrieval for Biomedical Question Answering

Optimized Biomedical Question-Answering Services with LLM and Multi-BERT Integration

Bio-AnswerFinder: a system to find answers to questions from biomedical texts

emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information

What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams

Improving Health Question Answering with Reliable and Time-Aware Evidence Retrieval

Study on Question Answering System for Biomedical Domain

OQA: A question-answering dataset on orthodontic literature

Unsupervised Pre-training for Biomedical Question Answering