Early Exit with Disentangled Representation and Equiangular Tight Frame.

Yixin Ji, Jikai Wang, Juntao Li, Qiang Chen, Wenliang Chen, Min Zhang
DOI: https://doi.org/10.18653/v1/2023.findings-acl.889
2023-01-01
Abstract: Dynamic early exit has demonstrated great potential in coping with the sharply increasing number of pre-trained language model parameters, achieving a good trade-off between performance and efficiency. The existing early exit paradigm relies on training parametric internal classifiers at each intermediate layer to complete specific tasks. Based on the predictions of these internal classifiers, different methods are designed to decide when to exit. In this setting, each intermediate layer takes on both generic language representation learning and task-specific feature extraction, so each layer struggles to balance two types of backward loss signals during training. To break this dilemma, we propose an adapter method to decouple the two distinct types of representation and further introduce a non-parametric simplex equiangular tight frame (ETF) classifier for improvement. Extensive experiments on monolingual and multilingual tasks demonstrate that our method gains significant improvements over strong PLM backbones and early exit methods.
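The abstract does not spell out how the simplex ETF classifier is built, so the following is only a minimal sketch based on the standard definition of a simplex ETF: K unit-norm class vectors whose pairwise inner products all equal -1/(K-1). The function names (simplex_etf, etf_predict), the choice of the standard basis as the orthonormal basis, and the inner-product decision rule are illustrative assumptions, not the authors' implementation.

```python
import math


def simplex_etf(num_classes, dim):
    """Return num_classes fixed class vectors of length dim forming a simplex ETF.

    Assumption: dim >= num_classes, and the first num_classes standard basis
    vectors serve as the orthonormal basis (any orthonormal basis would work).
    """
    if num_classes < 2 or dim < num_classes:
        raise ValueError("need 2 <= num_classes <= dim")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    vectors = []
    for i in range(k):
        # Column i of sqrt(K/(K-1)) * (I - 11^T / K), zero-padded to length dim.
        v = [scale * ((1.0 if j == i else 0.0) - 1.0 / k) for j in range(k)]
        vectors.append(v + [0.0] * (dim - k))
    return vectors


def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def etf_predict(feature, vectors):
    """Hypothetical decision rule: pick the class vector with the largest inner product."""
    scores = [dot(feature, v) for v in vectors]
    return max(range(len(scores)), key=scores.__getitem__)


# The class vectors are fixed up front, so the classifier has no trainable weights.
etf = simplex_etf(num_classes=3, dim=4)
```

Because the class vectors are computed once and never updated, a classifier of this kind adds no trainable parameters at an exit layer, which is the sense in which it is non-parametric. With three classes, every vector has unit norm and every pair of distinct vectors has inner product -0.5.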