Backpropagation through Back Substitution with a Backslash

Alan Edelman, Ekin Akyürek, Yuyang Wang
DOI: https://doi.org/10.1137/22m1532871
IF: 1.908
2024-02-09
SIAM Journal on Matrix Analysis and Applications
Abstract: SIAM Journal on Matrix Analysis and Applications, Volume 45, Issue 1, Pages 429-449, March 2024. We present a linear algebra formulation of backpropagation which allows the calculation of gradients by using a generically written "backslash" or Gaussian elimination on triangular systems of equations. Generally, the matrix elements are operators. This paper has three contributions: (i) it is of intellectual value to replace traditional treatments of automatic differentiation with a (left acting) operator theoretic, graph-based approach; (ii) operators can be readily placed in matrices in software in programming languages such as Julia as an implementation option; (iii) we introduce a novel notation, "transpose dot" operator "$\{\}^{T_\bullet}$" that allows for the reversal of operators. We further demonstrate the elegance of the operators approach in a suitable programming language consisting of generic linear algebra operators such as Julia [Bezanson et al., SIAM Rev., 59 (2017), pp. 65–98], and that it is possible to realize this abstraction in code. Our implementation shows how generic linear algebra can allow operators as elements of matrices. In contrast to "operator overloading," where backslash would normally have to be rewritten to take advantage of operators, with "generic programming" there is no such need.
mathematics, applied
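
The central idea of the abstract, gradients obtained from a triangular solve, can be illustrated with a small scalar sketch. This is not the authors' Julia implementation: the matrix entries here are plain numbers rather than operators, and the example function y = x1 * sin(x1^2) and the names `back_substitute` and `gradient_via_backslash` are hypothetical choices made for illustration. The sketch assumes a computational graph whose local partial derivatives form a strictly lower-triangular matrix L, so that the sensitivities g of the output with respect to every node satisfy the upper-triangular system (I - L)^T g = e_n, which back substitution solves in reverse order.

```python
import math

def back_substitute(U, b):
    """Solve U g = b for an upper-triangular U (list of rows), last row first."""
    n = len(b)
    g = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * g[j] for j in range(i + 1, n))
        g[i] = (b[i] - s) / U[i][i]
    return g

def gradient_via_backslash(a):
    """Hypothetical example: y = x1 * sin(x1**2), gradient via a triangular solve."""
    # Forward pass through the graph nodes.
    x1 = a
    x2 = x1 ** 2
    x3 = math.sin(x2)
    x4 = x3 * x1  # output node

    # L[j][i] holds the local partial derivative d(node j) / d(node i).
    n = 4
    L = [[0.0] * n for _ in range(n)]
    L[1][0] = 2 * x1         # d x2 / d x1
    L[2][1] = math.cos(x2)   # d x3 / d x2
    L[3][0] = x3             # d x4 / d x1 (direct edge)
    L[3][2] = x1             # d x4 / d x3

    # Upper-triangular system (I - L)^T g = e_n, with e_n selecting the output.
    U = [[(1.0 if i == j else 0.0) - L[j][i] for j in range(n)] for i in range(n)]
    e = [0.0] * (n - 1) + [1.0]
    g = back_substitute(U, e)  # g[i] = d x4 / d x_i
    return x4, g

value, grads = gradient_via_backslash(0.7)
```

Here `grads[0]` is the derivative of the output with respect to the input, and the remaining entries are the sensitivities of the output to the intermediate nodes. Back substitution visits the nodes from last to first, which is the same order as the reverse sweep of backpropagation.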