Syntax-informed Question Answering with Heterogeneous Graph Transformer

Fangyi Zhu, Lok You Tan, See-Kiong Ng, Stéphane Bressan
DOI: https://doi.org/10.48550/arXiv.2204.09655
2022-05-23
Abstract: Large neural language models are steadily contributing state-of-the-art performance to question answering and other natural language and information processing tasks. These models are expensive to train. We propose to evaluate whether such pre-trained models can benefit from the addition of explicit linguistics information without requiring retraining from scratch. We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained transformer-based neural language model with symbolic knowledge encoded with a heterogeneous graph transformer. We illustrate the approach by the addition of syntactic information in the form of dependency and constituency graphic structures connecting tokens and virtual vertices. A comparative empirical performance evaluation with BERT as its baseline and with Stanford Question Answering Dataset demonstrates the competitiveness of the proposed approach. We argue, in conclusion and in the light of further results of preliminary experiments, that the approach is extensible to further linguistics information including semantics and pragmatics.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how to improve the performance of pre-trained language models on question answering by adding explicit linguistic information, such as syntax, without retraining the models from scratch. Specifically, the authors propose SyHGT, a linguistics-informed question answering method built on a heterogeneous graph transformer. The method extends and fine-tunes a pre-trained Transformer-based neural language model with symbolic knowledge encoded by the heterogeneous graph transformer. In this work that knowledge is syntactic information, represented as dependency and constituency graph structures that connect tokens and virtual vertices. Experiments on the Stanford Question Answering Dataset (SQuAD) show improved exact match (EM) and F1 scores compared with using pre-trained BERT alone.
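To make the graph structure concrete, the following is a minimal sketch of how dependency and constituency information might be combined into one heterogeneous graph with token vertices and virtual (constituent) vertices. It is an illustration only, not the authors' implementation: the function name `build_hetero_graph`, the edge-type names, and the hand-written parse of the toy sentence are all assumptions made for this example. In practice the parses would come from a syntactic parser and the graph would be fed to a heterogeneous graph transformer.

```python
from collections import defaultdict


def build_hetero_graph(tokens, dep_edges, constituents):
    """Build a toy heterogeneous graph (hypothetical schema, for illustration).

    tokens:       list of token strings
    dep_edges:    list of (head_index, dependent_index, relation)
    constituents: list of (label, children), where each child is
                  ("token", index) or ("constituent", index)
    """
    # Two vertex types: real tokens and virtual constituent vertices.
    nodes = {
        "token": list(tokens),
        "constituent": [label for label, _ in constituents],
    }
    # Edges are keyed by (source type, relation, target type).
    edges = defaultdict(list)
    for head, dependent, relation in dep_edges:
        edges[("token", "dep:" + relation, "token")].append((head, dependent))
    for parent, (_, children) in enumerate(constituents):
        for child_type, child in children:
            edges[("constituent", "parent_of", child_type)].append((parent, child))
    return nodes, dict(edges)


# Hand-written parse of a toy sentence (an assumption, not parser output).
tokens = ["Shakespeare", "wrote", "Hamlet"]
dep_edges = [(1, 0, "nsubj"), (1, 2, "obj")]
constituents = [
    ("S", [("constituent", 1), ("constituent", 2)]),
    ("NP", [("token", 0)]),
    ("VP", [("token", 1), ("constituent", 3)]),
    ("NP", [("token", 2)]),
]

nodes, edges = build_hetero_graph(tokens, dep_edges, constituents)
for edge_type, pairs in sorted(edges.items()):
    print(edge_type, pairs)
```

Keying edges by a (source type, relation, target type) triple mirrors how heterogeneous graph libraries commonly organize typed edges, so a structure like this could be converted to such a library's input format with little extra work.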