Low-Rank Plus Diagonal Adaptation For Deep Neural Networks

Yong Zhao, Jinyu Li, Yifan Gong
DOI: https://doi.org/10.1109/icassp.2016.7472630
2016-01-01
Abstract: In this paper, we propose a scalable adaptation technique that adapts the deep neural network (DNN) model through a low-rank plus diagonal (LRPD) decomposition. An adaptation method should be able to match the number of adaptation parameters to the amount of available development data, so that the resulting models neither over-fit nor under-fit as the development data vary in size across speakers. The technique is motivated by the observation that adaptation matrices are very close to an identity matrix or are diagonally dominant. LRPD restructures the adaptation matrix as the sum of a diagonal matrix and a low-rank matrix. By varying the rank of the low-rank component, LRPD includes both the full and the diagonal adaptation matrix as special cases. Experimental results show that LRPD adaptation of the full-size DNN improves accuracy over standard linear adaptation. LRPD bottleneck adaptation reduces the speaker-specific footprint by 82% relative to an already very compact SVD bottleneck adaptation, at the expense of a 1% relative increase in word error rate (WER).