Incorporating Morphological Compositions with Transformer to Improve BERT

Yuncheng Song, Shuaifei Song, Juncheng Ge, Menghan Zhang, Wei Yang
DOI: https://doi.org/10.1088/1742-6596/1486/7/072071
2020-01-01
Journal of Physics Conference Series
Abstract: The BERT model achieves large performance gains by modeling words and their subwords as input units. However, it still neglects the semantic information carried by morphemes, the value of which has been verified in many previous works. In this paper, we propose the Transformer Morpheme Model (TMM), which is based on BERT and explores the effect of morphemes. Because previous approaches to processing morphemes are context-independent, TMM adopts a Transformer to process morpheme information at the input layer to overcome this problem. Experiments on the MRPC task are conducted to validate the feasibility of our model. TMM achieves a gain of about 1% over BERT on the MRPC task. The results demonstrate the superiority of our method and the effectiveness of morpheme information in the BERT model.