An Improved Mask Approach Based on Pointer Network for Domain Adaptation of BERT

Pengkai Lu, Dawei Jiang, Ying Li
DOI: https://doi.org/10.1088/1742-6596/1646/1/012072
2020-01-01
Journal of Physics: Conference Series
Abstract: The pre-trained BERT model has shown remarkable strength on downstream NLP tasks through fine-tuning. However, the performance of fine-tuned BERT drops when the model is applied directly to domain-specific tasks. The original fine-tuning method does not consider the accurate semantics of tokens in a specific domain. Instead of selecting tokens at random, we present a more efficient masking method that uses a pointer network to decide which tokens should be preferentially masked. The pointer network sorts the tokens in a sentence by their recovery difficulty. We then train a BERT model to predict the top-ranked tokens, which are replaced by [mask] in the original sentences. We tested the new training approach on biomedical corpora. Experiments show that the newly trained model outperforms the original BERT model on some domain-specific NLP tasks, while consuming additional domain corpus.
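
The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-token difficulty scores are assumed to come from some ranking model (the pointer network in the paper), and here they are simply passed in as numbers. The function name `mask_hardest_tokens`, the `mask_ratio` parameter, and the example scores are all hypothetical, and the mask string follows BERT's usual `[MASK]` spelling.

```python
import math
from typing import List, Sequence, Tuple

MASK_TOKEN = "[MASK]"  # BERT's usual mask token; the abstract writes it as [mask]


def mask_hardest_tokens(
    tokens: Sequence[str],
    difficulty: Sequence[float],
    mask_ratio: float = 0.15,
) -> Tuple[List[str], List[int]]:
    """Mask the tokens with the highest difficulty scores.

    `difficulty` is a stand-in for the ranking a pointer network would
    produce (an assumption for illustration); a higher score means the
    token is assumed to be harder to recover.
    """
    if len(tokens) != len(difficulty):
        raise ValueError("tokens and difficulty must have the same length")
    if not tokens:
        return [], []

    # Mask at least one token, otherwise a fixed fraction of the sentence.
    k = max(1, math.ceil(len(tokens) * mask_ratio))

    # Rank positions from hardest to easiest and keep the top k.
    ranked = sorted(range(len(tokens)), key=lambda i: difficulty[i], reverse=True)
    chosen = sorted(ranked[:k])

    masked = list(tokens)
    for i in chosen:
        masked[i] = MASK_TOKEN
    return masked, chosen


if __name__ == "__main__":
    sentence = ["the", "patient", "received", "intravenous", "heparin", "therapy"]
    scores = [0.1, 0.4, 0.3, 0.9, 0.95, 0.5]  # made-up scores, not real model output
    masked, positions = mask_hardest_tokens(sentence, scores, mask_ratio=0.3)
    print(masked)
    print(positions)
```

In contrast to random selection, the positions chosen here depend only on the ranking, so domain-specific terms that receive high scores would be masked first. How the scores are produced and how many tokens are masked per sentence are left open by the abstract.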