Abstract: Dual-Encoders is a promising mechanism for answer retrieval in question answering (QA) systems. Currently, most conventional Dual-Encoders learn the semantic representations of questions and answers merely through the matching score. Researchers have proposed introducing QA interaction features into the scoring function, but at the cost of low efficiency in the inference stage. To keep the encoding of questions and answers independent during the inference stage, a variational auto-encoder is further introduced to reconstruct answers (questions) from question (answer) embeddings as an auxiliary task that enhances QA interaction in representation learning during the training stage. However, the needs of text generation and answer retrieval are different, which makes training difficult. In this work, we propose a framework to enhance the Dual-Encoders model with question-answer cross-embeddings and a novel Geometry Alignment Mechanism (GAM) to align the geometry of embeddings from Dual-Encoders with that from Cross-Encoders. Extensive experimental results show that our framework significantly improves the Dual-Encoders model and outperforms the state-of-the-art method on multiple answer retrieval datasets.

What problem does this paper attempt to address?

The main problem this paper addresses is that existing Dual-Encoders models make little use of the interaction information between questions and answers in the answer retrieval task. Specifically, a Dual-Encoders model encodes questions and answers independently to obtain their embeddings and estimates their relevance through a similarity score. This independent encoding ignores the cross-information between questions and answers. Performance degrades especially in the one-to-many situation, that is, when a question has multiple matching answers or an answer matches multiple questions.

To address this problem, the authors propose an enhanced framework, ENDX (Enhancing Dual-encoders with CROSS-Embeddings), which introduces a Cross-Encoders model as additional guidance. In the training stage, the Geometry Alignment Mechanism (GAM) bridges the gap between Dual-Encoders and Cross-Encoders by aligning the geometry of the embeddings from Dual-Encoders with that of the embeddings from Cross-Encoders. This enables Dual-Encoders to better capture the complex relationships between questions and answers, thereby improving the accuracy of answer retrieval.

### Formula Summary

1. **Retrieval Loss of Dual-Encoders**:
\[ L_{\text{dual}} = -\frac{1}{B} \sum_{i = 1}^{B} \log \frac{\exp(\mathbf{R}_{\text{dual},i}^{q}\cdot\mathbf{R}_{\text{dual},i}^{a})}{\sum_{j = 1}^{B} \exp(\mathbf{R}_{\text{dual},i}^{q}\cdot\mathbf{R}_{\text{dual},j}^{a})} \]
where \(B\) is the batch size, \(i\) and \(j\) are the indices of question-answer pairs in a given batch, and \(\mathbf{R}^{q}\) and \(\mathbf{R}^{a}\) denote question and answer embeddings.

2. **Retrieval Loss of Cross-Encoders**:
\[ L_{\text{cross}} = -\frac{1}{B} \sum_{i = 1}^{B} \log \frac{\exp(\mathbf{R}_{\text{cross},i}^{q}\cdot\mathbf{R}_{\text{cross},i}^{a})}{\sum_{j = 1}^{B} \exp(\mathbf{R}_{\text{cross},i}^{q}\cdot\mathbf{R}_{\text{cross},j}^{a})} \]

3. **Loss Function of the Geometry Alignment Mechanism (GAM)**:
\[ L_{\text{ga}}=\alpha_{a|q}L_{a|q}+\alpha_{q|q}L_{q|q}+\alpha_{q|a}L_{q|a}+\alpha_{a|a}L_{a|a} \]
where each loss term is a KL divergence from the Dual-Encoders distribution to the Cross-Encoders distribution, for example:
\[ L_{q|q}=\frac{1}{B} \sum_{j = 1}^{B} \sum_{i = 1}^{B} p_{\text{cross}}(q_j|q_i)\log\frac{p_{\text{cross}}(q_j|q_i)}{p_{\text{dual}}(q_j|q_i)} \]
\(L_{a|a}\), \(L_{a|q}\) and \(L_{q|a}\) are defined similarly.

4. **Overall Loss Function**:
\[ L = \alpha_{\text{dual}}L_{\text{dual}} \dots \]
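
The in-batch retrieval loss and the \(L_{q|q}\) alignment term summarized above can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation. The summary does not say how \(p_{\text{cross}}(q_j|q_i)\) and \(p_{\text{dual}}(q_j|q_i)\) are computed, so the sketch assumes each is a softmax over dot-product similarities within the batch. The helper names (`dot`, `softmax`, `retrieval_loss`, `batch_distribution`, `alignment_term`) and the toy embeddings are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieval_loss(q_emb, a_emb):
    """In-batch softmax loss: -(1/B) * sum_i log p(a_i | q_i).

    The other answers in the batch act as negatives for question i.
    """
    batch = len(q_emb)
    loss = 0.0
    for i in range(batch):
        probs = softmax([dot(q_emb[i], a_emb[j]) for j in range(batch)])
        loss -= math.log(probs[i])
    return loss / batch


def batch_distribution(anchors, candidates):
    """ASSUMPTION: p(c_j | x_i) is a softmax over in-batch dot products.

    The source does not define these distributions; this is one plausible choice.
    """
    return [softmax([dot(x, c) for c in candidates]) for x in anchors]


def alignment_term(p_cross, p_dual):
    """(1/B) * sum_i sum_j p_cross[i][j] * log(p_cross[i][j] / p_dual[i][j])."""
    batch = len(p_cross)
    total = 0.0
    for i in range(batch):
        for j in range(batch):
            if p_cross[i][j] > 0.0:  # treat 0 * log(0) as 0
                total += p_cross[i][j] * math.log(p_cross[i][j] / p_dual[i][j])
    return total / batch


# Toy batch of two question-answer pairs (hypothetical values).
q_dual = [[1.0, 0.0], [0.0, 1.0]]
a_dual = [[1.0, 0.0], [0.0, 1.0]]
q_cross = [[2.0, 0.0], [0.0, 2.0]]

loss_dual = retrieval_loss(q_dual, a_dual)
loss_qq = alignment_term(
    batch_distribution(q_cross, q_cross),
    batch_distribution(q_dual, q_dual),
)
```

Under these assumptions, `loss_qq` is zero only when the dual-encoder question similarities induce the same in-batch distribution as the cross-encoder ones. The other three terms would reuse `alignment_term` with different anchor and candidate sets.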

Enhancing Dual-Encoders with Question and Answer Cross-Embeddings for Answer Retrieval

Dual Path Multi-Modal High-Order Features for Textual Content Based Visual Question Answering

Exploring Dual Encoder Architectures for Question Answering

QAEncoder: Towards Aligned Representation Learning in Question Answering System

Triple-Joint Modeling for Question Generation Using Cross-Task Autoencoder.

Improving Lexical Embeddings for Robust Question Answering

EEE-QA: Exploring Effective and Efficient Question-Answer Representations

Neural-Symbolic Entangled Framework for Complex Query Answering

Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering

Question Answering With Character-Level Lstm Encoders And Model-Based Data Augmentation

Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering

Question-Guided Semantic Dual-Graph Visual Reasoning with Novel Answers.

Question Answering and Question Generation as Dual Tasks

Context-aware Multi-level Question Embedding Fusion for visual question answering

Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering

A Dual Attentive Neural Network Framework with Community Metadata for Answer Selection.

Enhanced Answer Selection in CQA Using Multi-Dimensional Features Combination

Learning to Generate Question by Asking Question: A Primal-Dual Approach with Uncommon Word Generation

Dual-feature collaborative relation-attention networks for visual question answering

Addressing Semantic Drift in Question Generation for Semi-Supervised Question Answering

Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval