BioTABQA: Instruction Learning for Biomedical Table Question Answering

Man Luo, Sharad Saxena, Swaroop Mishra, Mihir Parmar, Chitta Baral
DOI: https://doi.org/10.48550/arXiv.2207.02419
2022-07-06
Abstract: Table Question Answering (TQA) is an important but under-explored task. Most existing QA datasets are in unstructured text format, and only a few of them use tables as the context. To the best of our knowledge, no TQA dataset exists in the biomedical domain, where tables are frequently used to present information. In this paper, we first curate a table question answering dataset, BioTABQA, using 22 templates and the context from a biomedical textbook on differential diagnosis. BioTABQA can be used not only to teach a model how to answer questions from tables but also to evaluate how a model generalizes to unseen questions, an important scenario for biomedical applications. To enable the generalization evaluation, we divide the templates into 17 training and 5 cross-task evaluation templates. Then, we develop two baselines using single-task and multi-task learning on BioTABQA. Furthermore, we explore instructional learning, a recent technique showing impressive generalization performance. Experimental results show that our instruction-tuned model outperforms the single-task and multi-task baselines on average by ~23% and ~6% across various evaluation settings and, more importantly, outperforms the baselines by ~5% on cross-tasks.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is the shortcomings of Table Question Answering (TQA) in the biomedical field. Specifically:

1. **Lack of TQA datasets in the biomedical field**: Most existing question-answering datasets are based on unstructured text, and only a few use tables as context. In the biomedical field, information is frequently presented in tabular form, yet no TQA dataset previously existed for this field.
2. **Insufficient generalization ability of models**: Many language models perform well on popular benchmarks, but in practical applications, especially in the biomedical field, their generalization ability is still limited. It is therefore necessary to evaluate and improve how well models generalize to unseen question types.

To address these problems, the authors take the following measures:

- **Create the BioTABQA dataset**: Using 22 templates and context from a biomedical textbook on differential diagnosis, they construct a new table question answering dataset, BioTABQA. The templates are divided into 17 for training and 5 for cross-task evaluation, so the dataset can be used both to train models to answer questions from tables and to evaluate how models generalize to unseen tasks.
- **Explore instruction learning**: Instruction learning is used to improve generalization in cross-task settings. The experimental results show that the instruction-tuned model outperforms the single-task and multi-task baselines by ~23% and ~6% on average across evaluation settings, and by ~5% on cross-task evaluation.

In summary, this paper aims to fill the gap in biomedical TQA research and to improve model generalization through instruction learning, so that models cope better with practical application scenarios.
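
The template split and the instruction-style input can be illustrated with a minimal Python sketch. It is not the authors' code: the summary does not say how the 17 training and 5 cross-task templates were chosen, so the seeded random split, the function names (`split_templates`, `linearize_table`, `build_input`), the table linearization format, and the toy table contents are all assumptions made for illustration.

```python
import random


def split_templates(template_ids, n_train=17, seed=0):
    """Shuffle template ids and hold out the remainder for cross-task evaluation.

    Assumption: a seeded random split; the paper may have chosen the split differently.
    """
    ids = list(template_ids)
    random.Random(seed).shuffle(ids)
    return sorted(ids[:n_train]), sorted(ids[n_train:])


def linearize_table(columns, rows):
    """Flatten a table into one string: 'col: value' pairs, rows separated by ' | '."""
    return " | ".join(
        "; ".join(f"{col}: {val}" for col, val in zip(columns, row))
        for row in rows
    )


def build_input(instruction, columns, rows, question):
    """Prepend a task instruction to the linearized table and the question."""
    return (
        f"Instruction: {instruction}\n"
        f"Table: {linearize_table(columns, rows)}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # 22 templates, as stated in the abstract: 17 for training, 5 held out.
    train_ids, cross_task_ids = split_templates(range(1, 23))
    print(len(train_ids), len(cross_task_ids))  # 17 5

    # Hypothetical toy table and template-generated question.
    columns = ["Condition", "Onset", "Fever"]
    rows = [["Condition A", "sudden", "yes"], ["Condition B", "gradual", "no"]]
    print(
        build_input(
            "Answer the question using only the table.",
            columns,
            rows,
            "Which condition has a gradual onset?",
        )
    )
```

Holding out whole templates, rather than individual questions, is what makes the evaluation cross-task: the model is tested on question types it never saw during training.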