A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement

Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee
DOI: https://doi.org/10.1109/TASLP.2019.2935891
2019-12-01
Abstract: This paper presents a theoretical analysis of deep neural network (DNN) based functional approximation. Building on two classical theorems on universal approximation, we use an artificial neural network (ANN) with a single hidden layer of neurons. With modified ReLU and Sigmoid activation functions, we first generalize the related concepts to vector-to-vector regression. Then, we show that the width of the hidden layer of the ANN is numerically related to the approximation of the regression function. Furthermore, we increase the number of hidden layers and show that the depth of the ANN-based regression function can enhance its expressive power. We illustrate this representation with recently emerged DNN-based speech enhancement. We first compare the expressive power of varying ANN structures and then test the related regression performance under noisy conditions covering various noise types and signal-to-noise ratio levels. Experimental results verify our theoretical prediction that an ANN with a wider hidden layer and a deeper architecture can jointly ensure a closer approximation of the vector-to-vector regression functions in terms of the Euclidean distance between the log power spectra of noisy and expected clean speech. Moreover, a DNN with a wider top hidden layer can improve regression performance relative to one with a narrower top hidden layer.