Learning transferable features in meta-learning for few-shot text classification

Jincheng Xu, Qingfeng Du
DOI: https://doi.org/10.1016/j.patrec.2020.05.007
IF: 4.757
2020-07-01
Pattern Recognition Letters
Abstract: One of the main issues with current deep learning models for classification is that they require massive amounts of training data, while obtaining sufficient annotated samples is usually time-consuming and labor-intensive. To address the problem of few-shot learning, meta-learning, which encourages fast adaptation to new learning tasks from only a limited number of training examples, has made significant progress recently. Although it has gained increasing attention and impact in computer vision, its practical use in natural language processing has rarely been explored. To learn transferable features effectively for few-shot text classification in a meta-learning framework, we propose a novel method which leverages the maximum mean discrepancy (MMD) metric of the adaptation layer to minimize the distance between the support and query distributions within each learning task and to regularize the meta-gradient update. We perform extensive empirical experiments to suggest best practices and to evaluate the effectiveness of our proposed method. In comparison with the well-established Model-Agnostic Meta-Learning (MAML) framework, our method achieves better generalization performance and produces remarkable accuracy gains on various text classification datasets.
computer science, artificial intelligence
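The abstract describes using the maximum mean discrepancy between support and query features as a regularizer. The following is a minimal sketch of that idea, not the authors' implementation: the Gaussian (RBF) kernel, the bandwidth `sigma`, the weight `lam`, the biased estimator, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d)."""
    # Pairwise squared Euclidean distances via broadcasting.
    sq = (np.sum(x ** 2, axis=1)[:, None]
          + np.sum(y ** 2, axis=1)[None, :]
          - 2.0 * x @ y.T)
    sq = np.maximum(sq, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-sq / (2.0 * sigma ** 2))

def mmd2(support, query, sigma=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_ss = rbf_kernel(support, support, sigma).mean()
    k_qq = rbf_kernel(query, query, sigma).mean()
    k_sq = rbf_kernel(support, query, sigma).mean()
    return k_ss + k_qq - 2.0 * k_sq

def regularized_loss(task_loss, support_feats, query_feats, lam=0.1, sigma=1.0):
    """Hypothetical objective: task loss plus a weighted MMD penalty.

    support_feats and query_feats stand in for adaptation-layer activations
    on the support and query sets of one task (an assumption of this sketch).
    """
    return task_loss + lam * mmd2(support_feats, query_feats, sigma)
```

The penalty is close to zero when the two feature sets are drawn from the same distribution and grows as they drift apart. In a MAML-style setup, a differentiable version of this term would be computed in an autodiff framework so that its gradient can enter the meta-update; how the paper does this is not stated in the abstract.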