Abstract: Multi-turn response selection is an important task in natural language processing that aims to select the most appropriate response given a multi-turn dialogue. Most state-of-the-art models adopt pre-trained language models (PrLMs) and multiple auxiliary tasks to improve their understanding of the semantics of multi-turn dialogue. However, two critical challenges remain. Optimizing multiple auxiliary tasks simultaneously may significantly increase the training cost. Meanwhile, the semantic gap between the optimization objectives of the main and auxiliary tasks may introduce noise into the pre-trained language model. To address these challenges, we propose an efficient BERT-based neural network model with local context comprehension (BERT-LCC) for multi-turn response selection. First, we propose a self-supervised learning strategy that introduces an auxiliary task named Response Prediction in Random Sliding Windows (RPRSW). In a multi-turn dialogue, the RPRSW task takes the utterances falling within a random sliding window as input and predicts whether the last utterance within the window is the appropriate response for the local dialogue context. This auxiliary task enhances BERT’s understanding of local semantic information. Second, we propose a local information fusion (LIF) mechanism that collects multi-granularity local features at different dialogue stages and employs a gating function to fuse global features with local features. Third, we introduce a simple but effective domain learning strategy to learn rich semantic information at different dialogue stages during pre-training. Experimental results on two public benchmark datasets show that BERT-LCC outperforms other state-of-the-art models.
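
The RPRSW task described in the abstract can be illustrated with a minimal Python sketch of how one training example might be built. The abstract does not specify the window-size range, the negative-sampling scheme, or the positive/negative ratio, so the function name `make_rprsw_example`, the parameters `min_window`, `max_window`, and `neg_prob`, and the choice to draw negatives from a pool of other utterances are all assumptions made for illustration, not the authors' implementation.

```python
import random


def make_rprsw_example(dialogue, corpus, rng, min_window=2, max_window=4, neg_prob=0.5):
    """Build one (context, candidate, label) triple from a random sliding window.

    dialogue: list of utterance strings in turn order.
    corpus:   pool of utterances used to draw negatives (assumed scheme).
    rng:      a random.Random instance, so sampling is reproducible.
    """
    if len(dialogue) < min_window:
        raise ValueError("dialogue is shorter than the smallest window")

    # Pick a random window size, then a random start position for the window.
    size = rng.randint(min_window, min(max_window, len(dialogue)))
    start = rng.randint(0, len(dialogue) - size)
    window = dialogue[start:start + size]

    # The last utterance in the window plays the role of the response;
    # everything before it is the local dialogue context.
    context, response = window[:-1], window[-1]

    if rng.random() < neg_prob:
        # Assumed negative sampling: swap in a different utterance from the pool.
        candidates = [u for u in corpus if u != response]
        return context, rng.choice(candidates), 0
    return context, response, 1


if __name__ == "__main__":
    dialogue = ["hi", "hello, how can I help?", "my disk is full",
                "try removing old logs", "thanks, that worked"]
    corpus = ["what time is it?", "reboot the router", "see you tomorrow"]
    print(make_rprsw_example(dialogue, corpus, random.Random(0)))
```

In this sketch a label of 1 means the candidate is the true last utterance of the window and 0 means it was replaced. A binary classifier over the encoded context and candidate would then be trained on these triples alongside the main response-selection objective.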

Improving BERT with local context comprehension for multi-turn response selection in retrieval-based dialogue systems

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

Multi-Head Attention: Collaborate Instead of Concatenate

One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs