LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics

Tiankai Xie, Jiaqing Chen, Yaoqing Yang, Caleb Geniesse, Ge Shi, Ajinkya Chaudhari, John Kevin Cava, Michael W. Mahoney, Talita Perciano, Gunther H. Weber, Ross Maciejewski
2024-12-18
Abstract: Modern machine learning often relies on optimizing a neural network's parameters using a loss function to learn complex features. Beyond training, examining the loss function with respect to a network's parameters (i.e., as a loss landscape) can reveal insights into the architecture and learning process. While the local structure of the loss landscape surrounding an individual solution can be characterized using a variety of approaches, the global structure of a loss landscape, which includes potentially many local minima corresponding to different solutions, remains far more difficult to conceptualize and visualize. To address this difficulty, we introduce LossLens, a visual analytics framework that explores loss landscapes at multiple scales. LossLens integrates metrics from global and local scales into a comprehensive visual representation, enhancing model diagnostics. We demonstrate LossLens through two case studies: visualizing how residual connections influence a ResNet-20, and visualizing how physical parameters influence a physics-informed neural network (PINN) solving a simple convection problem.
Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to solve is how to understand and visualize both the global and the local structure of a neural network's loss function more comprehensively, so as to improve the interpretability and diagnosis of the model architecture and the learning process.

### Problem Background

Modern machine learning usually learns complex features by optimizing a neural network's parameters to minimize a loss function. Although the local structure of the loss function near a specific solution can be studied with a variety of methods, its global structure (which may include many local minima) remains difficult to conceptualize and visualize. This limits researchers' understanding of model performance, generalization ability, and training dynamics.

### Paper Solution

To address this problem, the authors introduce a visual analytics framework named **LossLens**. LossLens explores and displays the loss landscape at multiple scales (local and global), helping researchers and practitioners better understand the characteristics of neural network models. Specifically:

1. **Local Analysis**:
   - LossLens provides single-model-level metrics and local views, so users can analyze properties such as the sharpness and curvature of the loss landscape around a specific solution.
   - Local flatness or sharpness is evaluated using the eigenvalues of the Hessian matrix:
     \[ \lambda_i(\nabla^2_\theta L(\theta)), \quad i \in [1, 10] \]
     where \(\lambda_i\) is the \(i\)-th eigenvalue of the Hessian matrix.
2. **Global Analysis**:
   - LossLens combines all trained models into a fully connected graph and uses a two-dimensional layout based on Centered Kernel Alignment (CKA) similarity to summarize the global landscape.
   - The connectivity between different models is evaluated through mode connectivity:
     \[ mc(\theta, \theta') = \frac{1}{2}(L(\theta) + L(\theta')) - L(\gamma_\phi(t^*)) \]
     where \(\gamma_\phi(t)\) is a curve connecting the two solutions \(\theta\) and \(\theta'\), and \(t^*\) maximizes the deviation \(|\frac{1}{2}(L(\theta) + L(\theta')) - L(\gamma_\phi(t))|\).
3. **Integrated Representation**:
   - LossLens provides an integrated visual representation that links metrics at different scales, enabling users to examine a model from multiple perspectives and at different levels.
   - For example, users can compare feature similarity between different layers through the CKA similarity matrix, and capture topological features through persistent homology and merge trees.

### Application Cases

The paper presents two case studies to illustrate how LossLens can be used to extract novel insights about different aspects of model development:

1. **The Influence of Residual Connections in ResNet-20**: Visualizing how residual connections affect the loss landscape of ResNet-20.
2. **The Influence of Physical Parameters in Physics-Informed Neural Networks (PINN)**: Visualizing how physical parameters affect a PINN solving a simple convection problem.

Through these methods, LossLens gives researchers a systematic framework for examining the loss landscape of neural networks from both local and global perspectives, thereby improving the interpretability and practicality of the model.
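
The local and global metrics summarized above (Hessian eigenvalues, mode connectivity, and CKA similarity) can be illustrated with a small NumPy sketch. This is not the LossLens implementation. The function names `top_hessian_eigenvalues`, `mode_connectivity`, and `linear_cka` are hypothetical, the Hessian is assumed to be small enough to hold as a dense matrix, the curve \(\gamma_\phi\) is assumed to have been evaluated already at a finite set of sampled \(t\) values, and the linear variant of CKA is an assumption, since the summary does not say which kernel is used.

```python
import numpy as np


def top_hessian_eigenvalues(hessian, k=10):
    """Return the k largest eigenvalues of a dense symmetric Hessian.

    Assumption: the Hessian fits in memory as a dense matrix. For real
    networks, iterative methods based on Hessian-vector products are
    normally used instead.
    """
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    return eigenvalues[::-1][:k]  # eigvalsh returns ascending order


def mode_connectivity(loss_theta, loss_theta_prime, curve_losses):
    """Compute mc = 0.5 * (L(theta) + L(theta')) - L(gamma(t*)).

    curve_losses holds L(gamma(t)) at sampled t values (an assumption:
    the curve has already been trained and evaluated elsewhere).
    t* is the sample with the largest absolute deviation from the
    endpoint average. Returns (mc, index of t*).
    """
    curve_losses = np.asarray(curve_losses, dtype=float)
    endpoint_average = 0.5 * (loss_theta + loss_theta_prime)
    deviations = endpoint_average - curve_losses
    t_star = int(np.argmax(np.abs(deviations)))
    return float(deviations[t_star]), t_star


def linear_cka(features_x, features_y):
    """Linear CKA between two (n_samples, n_features) activation matrices."""
    x = features_x - features_x.mean(axis=0, keepdims=True)
    y = features_y - features_y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))
```

With the formula as written, a negative `mc` means the loss at some point on the curve exceeds the average of the two endpoint losses, and a value near zero means the curve stays close to that average. `linear_cka` returns 1 when the two activation matrices are identical up to isotropic scaling, so values from it can populate a pairwise similarity matrix of the kind a CKA-based layout would need.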