Higher-order Comparisons of Sentence Encoder Representations

Mostafa Abdou, Artur Kulmizev, Felix Hill, Daniel M. Low, Anders Søgaard
DOI: https://doi.org/10.48550/arXiv.1909.00303
2019-09-05
Abstract: Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns across different measurement modalities (e.g., fMRI, electrophysiology, behavior). As a framework, RSA has several advantages over existing approaches to interpreting language encoders based on probing or diagnostic classification: it does not require large training samples, is not prone to overfitting, and enables a more transparent comparison between the representational geometries of different models and modalities. We demonstrate the utility of RSA by establishing a previously unknown correspondence between widely employed pretrained language encoders and human processing difficulty via eye-tracking data, showcasing its potential in the interpretability toolbox for neural models.
Computation and Language
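The comparison of representational geometries described in the abstract can be illustrated with a minimal sketch. It assumes a common RSA recipe that the abstract itself does not specify: build a dissimilarity matrix per representation space over the same stimuli (cosine distance here, an assumption), then correlate the two matrices' upper triangles (Spearman rank correlation here, also an assumption). All function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rdm_upper_triangle(vectors, distance=cosine_distance):
    """Pairwise dissimilarities for all stimulus pairs (i < j).

    This is the upper triangle of the representational dissimilarity
    matrix, flattened into a list.
    """
    pairs = combinations(range(len(vectors)), 2)
    return [distance(vectors[i], vectors[j]) for i, j in pairs]


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rsa_score(reps_a, reps_b):
    """Second-order similarity between two representation spaces.

    reps_a[i] and reps_b[i] must describe the same stimulus i; the two
    spaces may have different dimensionalities.
    """
    return spearman(rdm_upper_triangle(reps_a), rdm_upper_triangle(reps_b))


# Toy data (made up): four stimuli in a 3-d space and in a 2-d space.
space_a = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.2, 1.0]]
space_b = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.5], [0.1, 0.4]]
score = rsa_score(space_a, space_b)
```

Because only the dissimilarity structure is compared, the two spaces need not share a dimensionality, and no mapping between them is trained, which is in line with the abstract's point about not needing large training samples.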