Distilling Knowledge from BERT into Simple Fully Connected Neural Networks for Efficient Vertical Retrieval.

Peiyang Liu, Xi Wang, Lin Wang, Wei Ye, Xiangyu Xi, Shikun Zhang
DOI: https://doi.org/10.1145/3459637.3481909
2021-01-01
Abstract: Distilled BERT models are better suited than BERT to efficient vertical retrieval in online sponsored vertical search, where latency requirements are strict, because they have fewer parameters and faster inference. Unfortunately, most of these models still fall well short of the ideal inference speed. This paper presents a novel and effective method to distill knowledge from BERT into simple fully connected neural networks (FNN). Extensive experiments on English and Chinese datasets demonstrate that our method achieves results comparable to existing distilled BERT models while accelerating inference by more than ten times. We have successfully applied our method to our online sponsored vertical search engine and obtained remarkable improvements.
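The abstract does not describe the distillation objective itself, so the following is only a minimal sketch of generic soft-label knowledge distillation, not the authors' method. It assumes a teacher (e.g. BERT) and a student (e.g. a small fully connected network) that each emit relevance logits for the same query-document pair, and it measures how far the student's temperature-softened distribution is from the teacher's. The function names and the default temperature are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stabilised)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Assumption: this is the standard soft-label distillation loss, used here
    as a stand-in; the paper's actual objective is not given in the abstract.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits for one query-document pair (not-relevant, relevant).
teacher = [0.3, 2.1]
student = [0.9, 1.2]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two distributions diverge; in training, it would typically be minimised with respect to the student's parameters, often alongside a loss on the ground-truth labels.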