Leveraging the Graph Structure of Neural Network Training Dynamics

Fatemeh Vahedian, Ruiyu Li, Puja Trivedi, Di Jin, Danai Koutra
DOI: https://doi.org/10.1145/3511808.3557628
2022-01-01
Abstract: Understanding the training dynamics of deep neural networks (DNNs) is important because it can lead to improved training efficiency and task performance. Recent work has demonstrated that representing the wiring of neurons in feedforward DNNs as graphs is an effective strategy for understanding how architectural choices affect performance. However, these approaches fail to model training dynamics, since a single static graph cannot capture how DNNs change over the course of training. In this work, we therefore propose a compact, expressive temporal graph framework that effectively captures the dynamics of many workhorse architectures in computer vision. Specifically, our framework extracts an informative summary of graph properties (e.g., degree, eigenvector centrality) over a sequence of DNN graphs obtained during training. We demonstrate that the proposed framework captures useful dynamics by accurately predicting the task performance of trained networks from a summary over early training epochs (fewer than 5), across four different architectures and two image datasets. Moreover, by using a novel, highly scalable DNN graph representation, we further demonstrate that the proposed framework captures generalizable dynamics: summaries extracted from smaller-width networks remain effective when evaluated on larger widths.
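To make the idea of a summary of graph properties over a sequence of DNN graphs concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that one fully connected layer is modeled as a weighted bipartite graph whose edge weights are the absolute values of the layer weights; the abstract does not specify the paper's actual graph representation. The function names (`layer_graph`, `weighted_degree`, `eigenvector_centrality`, `temporal_summary`) and the choice of mean and standard deviation as the per-epoch summary are likewise hypothetical.

```python
import math
from statistics import fmean, pstdev


def layer_graph(weights):
    """Symmetric adjacency matrix of a bipartite graph built from one layer.

    `weights` is a list of rows, one per output neuron (assumed layout).
    Input neurons come first in the node ordering, then output neurons.
    Edge weight = |w| is an illustrative assumption.
    """
    n_out, n_in = len(weights), len(weights[0])
    n = n_in + n_out
    adj = [[0.0] * n for _ in range(n)]
    for o, row in enumerate(weights):
        for i, w in enumerate(row):
            adj[i][n_in + o] = adj[n_in + o][i] = abs(w)
    return adj


def weighted_degree(adj):
    """Sum of incident edge weights for every node."""
    return [sum(row) for row in adj]


def eigenvector_centrality(adj, iters=500, tol=1e-12):
    """Power iteration on (A + I); the identity shift avoids the
    oscillation that plain power iteration shows on bipartite graphs."""
    n = len(adj)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


def temporal_summary(snapshots):
    """Concatenate per-epoch (mean, std) of degree and centrality."""
    features = []
    for weights in snapshots:
        adj = layer_graph(weights)
        for values in (weighted_degree(adj), eigenvector_centrality(adj)):
            features += [fmean(values), pstdev(values)]
    return features


# Two toy "epochs" of a 2x2 layer (made-up numbers, not real training data).
snapshots = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 1.0], [1.0, 1.0]],
]
summary = temporal_summary(snapshots)
```

Under these assumptions, `summary` is a fixed-length feature vector (four values per epoch) that could be passed to any off-the-shelf regressor to predict final task performance from the first few epochs, which is the kind of use the abstract describes.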