Keyword-Specific Acoustic Model Pruning for Open-Vocabulary Keyword Spotting

Yujie Yang, Kun Zhang, Zhiyong Wu, Helen Meng
DOI: https://doi.org/10.1109/icassp49357.2023.10094747
2023-01-01
ICASSP
Abstract: Open-vocabulary keyword spotting (KWS) systems allow users to customize wake words, but their application is limited by model size. In this paper, we design a dynamic acoustic model with input-dependent parameters. We find that acoustic frames with similar pronunciation generate similar subnetworks, and that different parameters contribute to recognizing different phonemes. Based on this observation, we further constrain the structural similarity among subnetworks that share the same phoneme pseudo-label, so that independent subnetworks for recognizing different phonemes can be pruned out. When used in an end-to-end KWS system, the subnetworks that recognize the phonemes in the keyword are combined into a keyword-specific acoustic model, and the parameters that do not contribute to recognizing the keyword are pruned off. Experiments demonstrate that the proposed method can prune more than 80% of the parameters without performance loss.
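The combination step described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes each phoneme's subnetwork is available as a set of parameter indices, and the names `keyword_mask`, `pruned_fraction`, `PHONEME_SUBNETS`, and `TOTAL_PARAMS`, as well as all the numbers, are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each phoneme maps to the indices of the parameters
# its subnetwork uses. These values are illustrative, not from the paper.
TOTAL_PARAMS = 20
PHONEME_SUBNETS: Dict[str, Set[int]] = {
    "HH": {0, 1},
    "EY": {2, 3},
    "K": {4, 5},
    "AH": {6, 7},
    "S": {8, 9},
}


def keyword_mask(phoneme_subnets: Dict[str, Set[int]],
                 keyword_phonemes: List[str]) -> Set[int]:
    """Union of the subnetworks of every phoneme that occurs in the keyword."""
    kept: Set[int] = set()
    for phoneme in keyword_phonemes:
        kept |= phoneme_subnets[phoneme]
    return kept


def pruned_fraction(kept: Set[int], total_params: int) -> float:
    """Fraction of parameters that fall outside the keyword-specific model."""
    return 1.0 - len(kept) / total_params


# Assumed keyword whose phoneme sequence is "HH EY".
kept = keyword_mask(PHONEME_SUBNETS, ["HH", "EY"])
ratio = pruned_fraction(kept, TOTAL_PARAMS)
print(sorted(kept), ratio)
```

In this toy setup the keyword keeps only the parameters used by its own phonemes, so every parameter tied exclusively to other phonemes is dropped. How the real subnetworks are derived from the dynamic acoustic model is not specified in the abstract and is not modeled here.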