Compression Models via Meta-Learning and Structured Distillation for Named Entity Recognition.

Qing Zhang, Zhan Gao, Mei Zhang, Jianyong Duan, Hao Wang, Li He
DOI: https://doi.org/10.1109/IALP61005.2023.10336991
2023-01-01
Abstract: This paper addresses the high resource consumption of large models for named entity recognition (NER) by using meta-learning and structured distillation to produce lightweight models. Knowledge distillation from the models commonly used for NER is challenging because their output space is exponentially large. Previous work treated distillation as a structured prediction task, but did not use feedback from the student model to improve the student itself. This paper therefore proposes Meta-Structured Distillation (MSD), which incorporates meta-learning into structured distillation: the teacher's parameters are updated based on the student's performance feedback on the dataset, so that the teacher yields a better student model. Experimental results demonstrate the effectiveness of this approach, showing improvement over previous work in structured distillation.
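The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the paper's MSD algorithm: it assumes a one-parameter logistic teacher and student, token-level (not structured) soft labels, and a finite-difference estimate of the teacher's meta-gradient in place of backpropagation through the student update. All names (`student_step`, `quiz_loss`, `meta_distill`) and hyperparameters are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(target, pred, eps=1e-12):
    # Binary cross-entropy of `pred` against a soft or hard `target`.
    return -(target * math.log(pred + eps)
             + (1.0 - target) * math.log(1.0 - pred + eps))

def student_step(w_student, w_teacher, train_x, lr):
    # One gradient step of the student on the teacher's soft labels.
    grad = 0.0
    for x in train_x:
        soft = sigmoid(w_teacher * x)   # teacher's soft label
        pred = sigmoid(w_student * x)   # student's prediction
        grad += (pred - soft) * x       # d/dw of the cross-entropy
    return w_student - lr * grad / len(train_x)

def quiz_loss(w_student, quiz):
    # Student's loss on held-out labeled data: the "feedback" signal.
    return sum(cross_entropy(y, sigmoid(w_student * x))
               for x, y in quiz) / len(quiz)

def meta_distill(train_x, quiz, steps=200, lr_s=0.5, lr_t=0.5, h=1e-4):
    # Assumed initial values: a weak teacher and an untrained student.
    w_teacher, w_student = 0.5, 0.0
    for _ in range(steps):
        # How does the quiz loss of the *updated* student change when the
        # teacher changes? Estimated here by central finite differences.
        up = quiz_loss(student_step(w_student, w_teacher + h, train_x, lr_s), quiz)
        down = quiz_loss(student_step(w_student, w_teacher - h, train_x, lr_s), quiz)
        w_teacher -= lr_t * (up - down) / (2.0 * h)
        # The student then distills from the adjusted teacher.
        w_student = student_step(w_student, w_teacher, train_x, lr_s)
    return w_teacher, w_student
```

In this sketch the teacher is rewarded only through the student's held-out loss, so it drifts toward soft labels that make the student's next step more useful. A practical implementation would more likely compute the meta-gradient with automatic differentiation and replace the per-token loss with a structured one over label sequences.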