Evaluating Representational Similarity Measures from the Lens of Functional Correspondence

Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla
2024-11-22
Abstract: Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research.
Neurons and Cognition, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper asks how researchers in neuroscience and artificial intelligence should choose a representational similarity measure so that it separates models according to their behavior. Specifically, it examines commonly used representational similarity measures in the visual domain to find out which ones best reflect behavioral differences between models, in particular the difference between trained and untrained models. It also evaluates how consistently these measures agree with behavioral metrics, in order to determine which are more functionally meaningful.

### Main contributions of the paper:

1. **Comprehensive analysis of commonly used representational similarity measures**: The paper analyzes eight commonly used representational similarity measures, spanning alignment-based, representational similarity matrix-based, CCA-based, and nearest-neighbor methods, and shows how they differ in their ability to distinguish between models.
2. **Supplementary comparison of behavioral metrics**: To evaluate whether these distinctions reflect behavioral differences between models, the paper adds a comprehensive set of behavioral metrics and finds that behavioral metrics generally agree with each other more than representational similarity measures do.
3. **Cross-comparison between representational similarity and behavioral metrics**: By comparing the two families of measures against each other, the paper shows that linear CKA and Procrustes distance align most closely with the behavioral evaluations, while measures widely used in neuroscience, such as linear predictivity, align more weakly.

### Related work:

- Only a few studies have directly compared how well representational similarity measures discriminate between models, and most of them focus on identifying measures that distinguish models by how they were constructed.
- Some studies select measures by evaluating how well they match corresponding layers of models trained from different initializations.
- The work closest to this study is that of Ding et al., who optimized synthetic datasets to mimic brain activity and showed that some measures, such as linear predictivity and CKA, can still yield high scores even when task-relevant variables are not encoded.

### Conclusion:

Through systematic analysis and comparison, the paper offers guidance for NeuroAI research: when selecting a representational similarity measure, priority should go to measures that reflect behaviorally meaningful differences between models. In particular, linear CKA and Procrustes distance are recommended because they reliably separate trained from untrained models and align closely with behavioral metrics. This helps researchers choose measures suited to their research goals.
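
To make the two recommended measures concrete, here is a minimal NumPy sketch of linear CKA and an orthogonal Procrustes distance. It is an illustration under stated assumptions, not the paper's implementation: the summary does not give the exact centering or normalization, so this sketch assumes column-centered response matrices of shape `(n_stimuli, n_features)` and, for Procrustes, scaling to unit Frobenius norm. The function names `center`, `linear_cka`, and `procrustes_distance` are hypothetical.

```python
import numpy as np


def center(X):
    """Subtract the per-feature mean so every column has zero mean."""
    return X - X.mean(axis=0, keepdims=True)


def linear_cka(X, Y):
    """Linear CKA between two (n_stimuli, n_features) response matrices.

    Computed as ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data.
    The two matrices may have different numbers of features.
    """
    X, Y = center(X), center(Y)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


def procrustes_distance(X, Y):
    """Squared orthogonal Procrustes distance, min over orthogonal Q of ||X - YQ||_F^2.

    Assumption: both matrices are centered and scaled to unit Frobenius norm,
    so the minimum equals 2 - 2 * (nuclear norm of X^T Y).
    """
    X, Y = center(X), center(Y)
    X = X / np.linalg.norm(X, ord="fro")
    Y = Y / np.linalg.norm(Y, ord="fro")
    nuclear = np.linalg.norm(X.T @ Y, ord="nuc")  # sum of singular values
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 - 2.0 * nuclear)
```

Both functions are invariant to an orthogonal transformation of the feature axes. Comparing a matrix `X` with a rotated copy `X @ Q` should therefore give a CKA close to 1 and a Procrustes distance close to 0, whereas two unrelated random matrices should score as less similar. This invariance is one way to read the abstract's remark that these measures emphasize the overall geometric structure of representations.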