Abstract: Measuring the similarity of the internal representations of deep neural networks is an important and challenging problem. Model stitching has been proposed as a possible approach, where two half-networks are connected by mapping the output of the first half-network to the input of the second one. The representations are considered functionally similar if the resulting stitched network achieves good task-specific performance. The mapping is normally created by training an affine stitching layer on the task at hand while freezing the two half-networks, a method called task loss matching. Here, we argue that task loss matching may be very misleading as a similarity index. For example, it can indicate very high similarity between very distant layers, whose representations are known to have different functional properties. Moreover, it can indicate very distant layers to be more similar than architecturally corresponding layers. Even more surprisingly, when comparing layers within the same network, task loss matching often indicates that some layers are more similar to a layer than itself. We argue that the main reason behind these problems is that task loss matching tends to create out-of-distribution representations to improve task-specific performance. We demonstrate that direct matching (when the mapping minimizes the distance between the stitched representations) does not suffer from these problems. We compare task loss matching, direct matching, and well-known similarity indices such as CCA and CKA. We conclude that direct matching strikes a good balance between the structural and functional requirements for a good similarity index.
What problem does this paper attempt to address?
The core problem this paper addresses is **how to measure the similarity of internal representations in deep neural networks more reliably**. Specifically, it asks whether task loss matching (TLM) or direct matching (DM) yields a trustworthy similarity index within the model stitching framework.
### Background and Problem Description
1. **Measurement of Internal Representation Similarity**:
- Measuring the similarity of internal representations in deep neural networks is an important and challenging problem. This involves not only structural similarity (such as geometric characteristics of activation distributions) but also functional similarity (i.e., whether one representation can be transformed into another while maintaining its functionality).
2. **Model Stitching Method**:
- Model stitching connects two half-networks through a stitching layer, usually an affine map that is trained on the task while both half-networks stay frozen. If the stitched network performs well on the task, the two representations are considered functionally similar.
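The stitching setup described above can be illustrated with a small NumPy sketch. Everything here is a toy assumption rather than the paper's setup: the two "half-networks" are random frozen weight matrices (`W_front`, `W_back`), the data and labels are synthetic, and the stitching layer `T(h) = h M + b` is trained with plain gradient descent on the cross-entropy task loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): 256 samples, 10 features, 3 classes.
X = rng.normal(size=(256, 10))
y = np.argmax(X @ rng.normal(size=(10, 3)), axis=1)
Y = np.eye(3)[y]                                   # one-hot labels

# Frozen half-networks; random weights stand in for trained models.
W_front = rng.normal(size=(10, 8)) / np.sqrt(10)   # first half of network f
W_back = rng.normal(size=(8, 3)) / np.sqrt(8)      # second half of network g

def front(x):
    return np.maximum(x @ W_front, 0.0)            # ReLU features

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def task_loss(M, b, H):
    p = softmax((H @ M + b) @ W_back)
    return -np.mean(np.sum(Y * np.log(p + 1e-12), axis=1))

# Affine stitching layer, initialised to the identity map.
M, b = np.eye(8), np.zeros(8)
H = front(X)                                       # frozen activations
loss_before = task_loss(M, b, H)

for _ in range(200):
    p = softmax((H @ M + b) @ W_back)
    dS = (p - Y) @ W_back.T / len(X)   # gradient w.r.t. stitched activations
    M -= 0.1 * H.T @ dS                # only the stitching layer is updated
    b -= 0.1 * dS.sum(axis=0)

loss_after = task_loss(M, b, H)
```

Only `M` and `b` change during training; `W_front` and `W_back` are never updated, which mirrors the "frozen half-networks" condition. The final task loss (or accuracy) of the stitched network is what task loss matching reads as a similarity signal.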
### Main Contributions of the Paper
1. **Limitations of Task Loss Matching**:
- The authors argue that task loss matching can be misleading as a similarity index. It can indicate very high similarity between distant layers whose representations are known to have different functional properties, and it can rate distant layers as more similar than architecturally corresponding ones. Even within a single network, it often indicates that some other layer is more similar to a given layer than that layer is to itself. The authors attribute these problems to the tendency of task loss matching to create out-of-distribution representations in order to improve task-specific performance.
2. **Advantages of Direct Matching**:
- Direct matching avoids these problems by fitting the stitching map to minimize the distance between the stitched representations rather than the task loss. Experiments show that direct matching does not exhibit these anomalies and achieves a good balance between structural and functional requirements.
3. **Mixed Similarity Index**:
- The authors suggest combining structural and functional aspects when measuring similarity, and they propose model stitching based on direct matching as an example of such a mixed approach.
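As a rough illustration of direct matching, the sketch below fits an affine map between two activation matrices by ordinary least squares. This minimizes the squared Frobenius distance over a sample, which is one common way to solve such a matching problem; whether it coincides with the authors' exact procedure is an assumption. The function name `fit_direct_matching` and the synthetic activations are hypothetical.

```python
import numpy as np

def fit_direct_matching(A, B):
    """Least-squares affine map from activations A (n x d1) to B (n x d2)."""
    A1 = np.hstack([A, np.ones((A.shape[0], 1))])   # append a bias column
    W, *_ = np.linalg.lstsq(A1, B, rcond=None)
    return W[:-1], W[-1]                            # M (d1 x d2), b (d2,)

# Synthetic activations: B is an exact affine function of A.
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 6))
M_true = rng.normal(size=(6, 4))
B = A @ M_true + 0.5

M, b = fit_direct_matching(A, B)
residual = np.linalg.norm(A @ M + b - B)            # close to zero here
```

No labels and no task loss appear anywhere in the fit, so the map has no incentive to push activations away from the target distribution in order to gain accuracy.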
### Experimental Results
- **Cross-Network Layer Identification**:
- Task loss matching performs poorly in identifying corresponding layers between different networks, and PWCCA and OPD also have similar problems, while LCKA performs better.
- **Intra-Network Layer Identification**:
- More seriously, when layers of the same network are compared, task loss matching often fails to rank a layer as most similar to itself. Since this is a basic requirement for a similarity index, the failure is especially notable.
- **OOD Analysis**:
- Using an energy-based OOD detection method, the authors find that task loss matching does tend to generate OOD representations, even when the stitched network performs well on the task. In contrast, direct matching is more likely to generate in-distribution representations.
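For readers unfamiliar with energy-based OOD detection, the sketch below computes the standard energy score, E = -T · log Σ exp(logit_i / T), from a vector of logits. How the paper applies this score to stitched representations is not detailed in this summary, so the example inputs and the function name `energy_score` are illustrative assumptions only.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Energy score per sample; higher values suggest out-of-distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)               # stabilise the log-sum-exp
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

# A peaked logit vector gets lower energy than a flat one.
confident = energy_score([[10.0, 0.0, 0.0]])
uncertain = energy_score([[0.0, 0.0, 0.0]])
```

In practice a threshold on the score separates in-distribution from OOD inputs; choosing that threshold depends on the data and is not shown here.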
### Conclusion
The paper concludes that direct matching is the better choice for measuring the similarity of internal representations in deep neural networks because it balances structural and functional requirements. Task loss matching can improve task performance in some cases, but it is prone to generating OOD representations, which leads to unreasonable similarity estimates.
### Formula Summary
- **Task Loss Matching (TLM) Optimization Problem**:
\[
\arg \min_{\theta} \mathbb{E}_{p(x,y)}[L([g^{>j} \circ T_{\theta} \circ f^{\leq i}](x), y)]
\]
- **Direct Matching (DM) Optimization Problem**:
\[
\arg \min_{\theta} \mathbb{E}_{p(x)}[\| [T_{\theta} \circ f^{\leq i}](x) - g^{\leq j}(x) \|_F]
\]
- **Linear CKA Calculation Formula**:
\[
LCKA(A, B) = \frac{\|B^T A\|_F^2}{\|A^T A\|_F \|B^T B\|_F}
\]
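The linear CKA formula translates almost directly into NumPy. The sketch below assumes `A` and `B` are activation matrices with one row per sample and centers their columns first, which is the usual convention for this formula; the function name `linear_cka` and the random test data are illustrative.

```python
import numpy as np

def linear_cka(A, B):
    """Linear CKA between activation matrices A (n x d1) and B (n x d2)."""
    A = A - A.mean(axis=0, keepdims=True)           # center each feature
    B = B - B.mean(axis=0, keepdims=True)
    num = np.linalg.norm(B.T @ A, "fro") ** 2
    den = np.linalg.norm(A.T @ A, "fro") * np.linalg.norm(B.T @ B, "fro")
    return num / den

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))        # random orthogonal matrix

same = linear_cka(A, A)                             # identical representations
rotated = linear_cka(A, 3.0 * A @ Q)                # rotated and rescaled copy
unrelated = linear_cka(A, rng.normal(size=(200, 5)))
```

Both `same` and `rotated` equal 1 up to floating-point error, which reflects the invariance of linear CKA to orthogonal transformations and isotropic scaling, while `unrelated` is much lower.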
- **Projection-Weighted Canonical Correlation Analysis (PWCCA)**