Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work

Cheng-Han Chiang, Yung-Sung Chuang, Hung-yi Lee
DOI: https://doi.org/10.18653/v1/2022.aacl-tutorials.2
2022-01-01
Abstract: Pre-trained language models (PLMs) are language models that are pre-trained on large-scale corpora in a self-supervised fashion. These PLMs have fundamentally changed the natural language processing community in the past few years. In this tutorial, we aim to provide a broad and comprehensive introduction from two perspectives: why those PLMs work, and how to use them in NLP tasks. The first part of the tutorial presents analyses of PLMs that partially explain their exceptional downstream performance. The second part first focuses on emerging pre-training methods that enable PLMs to perform diverse downstream tasks, and then illustrates how one can apply those PLMs to downstream tasks under different circumstances. These circumstances include fine-tuning PLMs under data scarcity and using PLMs in a parameter-efficient way. We believe that attendees of different backgrounds will find this tutorial informative and useful.