Wirtinger calculus based gradient descent and Levenberg-Marquardt learning algorithms in complex-valued neural networks

Md. Faijul Amin, Muhammad Ilias Amin, A. Y. H. Al-Nuaimi, Kazuyuki Murase
DOI: https://doi.org/10.1007/978-3-642-24955-6_66
2011-01-01
Abstract: Complex-valued neural networks (CVNNs) bring in nonholomorphic functions in two ways: (i) through their loss functions and (ii) through the widely used activation functions. The derivatives of such functions are defined in Wirtinger calculus. In this paper, we derive two popular algorithms, gradient descent and the Levenberg-Marquardt (LM) algorithm, for parameter optimization in feedforward CVNNs using Wirtinger calculus, which is simpler than the conventional derivation that treats the problem in the real domain. While deriving the LM algorithm, we solve a least squares problem in the complex domain, $\min_{\mathbf{z}} \|\mathbf{b} - (\mathbf{A}\mathbf{z} + \mathbf{B}\mathbf{z}^*)\|$, and use its result; this problem is more general than $\min_{\mathbf{z}} \|\mathbf{b} - \mathbf{A}\mathbf{z}\|$. Computer simulation results show that, as in the real-valued case, the complex-LM algorithm provides much faster learning with higher accuracy than the complex gradient descent algorithm.
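To make the generalized least squares problem concrete, the following is a minimal numerical sketch, not the paper's derivation. The abstract states that the problem is solved in the complex domain; the sketch instead takes the conventional real-domain route, writing $\mathbf{z} = \mathbf{x} + i\mathbf{y}$ so that $\mathbf{A}\mathbf{z} + \mathbf{B}\mathbf{z}^* = (\mathbf{A}+\mathbf{B})\mathbf{x} + i(\mathbf{A}-\mathbf{B})\mathbf{y}$, and may serve as a cross-check for a complex-domain solution. The function name `solve_widely_linear_lstsq` and the random test data are assumptions introduced here for illustration.

```python
import numpy as np

def solve_widely_linear_lstsq(A, B, b):
    """Minimize ||b - (A z + B conj(z))||_2 over complex z.

    Illustrative real-domain reformulation (an assumption of this sketch,
    not the complex-domain solution derived in the paper).
    """
    S = A + B  # multiplies the real part x of z
    D = A - B  # multiplies i times the imaginary part y of z
    # Stack real and imaginary parts of the residual into one real system.
    M = np.block([[S.real, -D.imag],
                  [S.imag,  D.real]])
    rhs = np.concatenate([b.real, b.imag])
    sol, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    n = A.shape[1]
    return sol[:n] + 1j * sol[n:]

# Hypothetical test data: a consistent overdetermined system.
rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
B = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
z_true = rng.normal(size=n) + 1j * rng.normal(size=n)
b = A @ z_true + B @ np.conj(z_true)

z_hat = solve_widely_linear_lstsq(A, B, b)
residual = np.linalg.norm(b - (A @ z_hat + B @ np.conj(z_hat)))
```

Setting `B` to zero reduces the stacked system to the ordinary complex least squares problem $\min_{\mathbf{z}} \|\mathbf{b} - \mathbf{A}\mathbf{z}\|$, which is the sense in which the first problem is the more general one.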