SS-BERT: A Semantic Information Selecting Approach for Open-Domain Question Answering
Xuan Fu, Jiangnan Du, Hai-Tao Zheng, Jianfeng Li, Cuiqin Hou, Qiyu Zhou, Hong-Gee Kim
DOI: https://doi.org/10.3390/electronics12071692
IF: 2.9
2023-04-04
Electronics
Abstract: Open-Domain Question Answering (Open-Domain QA) aims to answer any factoid question posed by users. Recent progress in Open-Domain QA has adopted the "retriever-reader" structure, which has proven effective. Retrievers fall mainly into two categories: sparse and dense. In recent work, dense retrievers have shown stronger semantic interpretation than sparse retrievers. Training a dual-encoder dense retriever for document retrieval and reranking poses two challenges: negative selection and a lack of training data. In this study, we make three major contributions: negative selection by query generation, data augmentation from negatives, and a passage evaluation method. We show that the model performs better on the Open-Domain QA passage reranking task when it focuses on false negatives and data augmentation. Our model outperforms other single dual-encoder rerankers based on BERT-base and BM25 by 0.7 in MRR@10, and it achieves the highest Recall@50 and the maximum Recall@1000, which is bounded by the BM25 retrieval results.
engineering, electrical & electronic; computer science, information systems; physics, applied
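
The abstract reports results in MRR@10 and Recall@k. As a minimal sketch of how these metrics are commonly computed for passage reranking, the Python below may help readers interpret the numbers. It is not the authors' evaluation code. The function names, the dictionary layout, and the choice to average per-query recall (rather than count queries with at least one hit) are assumptions made for illustration.

```python
from typing import Dict, List, Set


def mrr_at_k(rankings: Dict[str, List[str]],
             relevant: Dict[str, Set[str]],
             k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage in the top k.

    A query with no relevant passage in its top k contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked in rankings.items():
        gold = relevant.get(qid, set())
        for rank, pid in enumerate(ranked[:k], start=1):
            if pid in gold:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


def recall_at_k(rankings: Dict[str, List[str]],
                relevant: Dict[str, Set[str]],
                k: int) -> float:
    """Average, over queries, of the share of relevant passages in the top k.

    Assumption: queries without any labeled relevant passage are skipped.
    """
    scores = []
    for qid, ranked in rankings.items():
        gold = relevant.get(qid, set())
        if not gold:
            continue
        found = len(gold.intersection(ranked[:k]))
        scores.append(found / len(gold))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: passage IDs ranked by a reranker for two queries.
rankings = {"q1": ["p3", "p1", "p2"], "q2": ["p9", "p8", "p7"]}
relevant = {"q1": {"p1"}, "q2": {"p7", "p5"}}

mrr10 = mrr_at_k(rankings, relevant, k=10)
recall2 = recall_at_k(rankings, relevant, k=2)
recall3 = recall_at_k(rankings, relevant, k=3)
```

Because a reranker only reorders the candidates it receives, Recall@k at the full candidate depth cannot exceed the recall of the first-stage retriever, which is why the abstract describes Recall@1000 as limited by the BM25 results.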