Abstract: Machine Reading Comprehension (MRC) is an important NLP task whose goal is to extract answers to user questions from background passages. In conversational applications, MRC must model context across multiple turns, a setting that has recently drawn considerable attention. Previous studies on multi-turn MRC usually focus on a single domain, ignoring the fact that knowledge is transferable across different MRC tasks. To address this issue, we present a unified framework that models both single-turn and multi-turn MRC tasks and allows knowledge from different source MRC tasks to be shared to help solve the target MRC task. Specifically, we propose the Context-Aware Transferable Bidirectional Encoder Representations from Transformers (CAT-BERT) model, which jointly learns to solve single-turn and multi-turn MRC tasks within a single pre-trained language model. For the multi-turn setting, the model encodes both history questions and history answers into the context. To capture the task-level importance of different layer outputs, a task-specific attention layer is added on top of the CAT-BERT outputs, reflecting the positions that the model should attend to for a specific MRC task. Extensive experiments and ablation studies show that CAT-BERT achieves competitive results on multi-turn MRC tasks, outperforming strong baselines.
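
The abstract does not say how the task-specific attention layer is computed. One common way to weight layer outputs per task is a softmax over learnable per-layer scores. The following is a minimal sketch under that assumption, not the authors' implementation. The names `mix_layers` and `TASK_SCORES`, and the example score values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_outputs, task_scores):
    """Combine per-layer token vectors using task-specific layer weights.

    layer_outputs: list of L layers; each layer is a list of T token
                   vectors; each vector is a list of D floats.
    task_scores:   list of L floats, one (learnable) score per layer for
                   the current task. ASSUMPTION: the abstract does not
                   specify this parameterization.
    Returns a list of T mixed token vectors of dimension D.
    """
    if len(layer_outputs) != len(task_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(task_scores)
    num_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    mixed = [[0.0] * dim for _ in range(num_tokens)]
    for weight, layer in zip(weights, layer_outputs):
        for t, vector in enumerate(layer):
            for d, value in enumerate(vector):
                mixed[t][d] += weight * value
    return mixed


# Hypothetical per-task scores: each MRC task keeps its own score vector,
# while the encoder that produces `layer_outputs` is shared across tasks.
TASK_SCORES = {
    "single_turn": [0.0, 0.0],  # weights both layers equally
    "multi_turn": [0.0, 2.0],   # favors the upper layer
}

# Toy encoder output: 2 layers, 1 token, 2 dimensions.
toy_layers = [[[1.0, 0.0]], [[0.0, 1.0]]]
single_turn_mix = mix_layers(toy_layers, TASK_SCORES["single_turn"])
multi_turn_mix = mix_layers(toy_layers, TASK_SCORES["multi_turn"])
```

In this sketch only the score vectors differ between tasks, which matches the abstract's idea of sharing one pre-trained model across source and target tasks while letting each task emphasize different layers.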

CAT-BERT: A Context-Aware Transferable BERT Model for Multi-turn Machine Reading Comprehension.