Abstract: The human cortex encodes information in complex networks that can be anatomically dispersed and variable in their microstructure across individuals. Using simulations with neural network models, we show that contemporary statistical methods for functional brain imaging (including univariate contrast, searchlight multivariate pattern classification, and whole-brain decoding with L1 or L2 regularization) each have critical and complementary blind spots under these conditions. We then introduce the sparse-overlapping-sets (SOS) LASSO, a whole-brain multivariate approach that exploits structured sparsity to find network-distributed information, and show in simulation that it captures the advantages of other approaches while avoiding their limitations. When applied to fMRI data to find neural responses that discriminate visually presented faces from other visual stimuli, each method yields a different result, but existing approaches all support the canonical view that face perception engages localized areas in posterior occipital and temporal regions. In contrast, SOS LASSO uncovers a network spanning all four lobes of the brain. The result cannot reflect spurious selection of out-of-system areas because decoding accuracy remains exceedingly high even when canonical face and place systems are removed from the dataset. When used to discriminate visual scenes from other stimuli, the same approach reveals a localized signal consistent with other methods, illustrating that SOS LASSO can detect both widely distributed and localized representational structure. Thus, structured sparsity can provide an unbiased method for testing claims of functional localization. For faces and possibly other domains, such decoding may reveal representations more widely distributed than previously suspected.

Significance Statement: Brain systems represent information as patterns of activation over neural populations connected in networks that can be widely distributed anatomically, variable across individuals, and intermingled with other networks. We show that four widespread statistical approaches to functional brain imaging have critical blind spots in this scenario and use simulations with neural network models to illustrate why. We then introduce a new approach designed specifically to find radically distributed representations in neural networks. In simulation and in fMRI data collected in the well-studied domain of face perception, the new approach discovers extensive signal missed by the other methods, suggesting that prior functional imaging work may have significantly underestimated the degree to which neurocognitive representations are distributed and variable across individuals.
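
The abstract does not state the SOS LASSO penalty, so the following is only a minimal sketch of the general structured-sparsity idea it refers to. It assumes a sparse-group-style penalty (a set-wise L2 term plus an element-wise L1 term) over sets that do not overlap, which is a simplification of the overlapping-sets case. The function name, the parameters `alpha` and `lam`, and the toy voxel sets are all hypothetical and are not taken from the paper.

```python
import math

def sparse_group_penalty(weights, sets, alpha=1.0, lam=1.0):
    """Assumed penalty: lam * sum over sets of (alpha * ||w_S||_2 + ||w_S||_1).

    Simplification: the sets are non-overlapping, so each weight
    belongs to exactly one set.
    """
    total = 0.0
    for members in sets:
        w = [weights[i] for i in members]
        l2 = math.sqrt(sum(v * v for v in w))  # set-level (group) term
        l1 = sum(abs(v) for v in w)            # element-level sparsity term
        total += alpha * l2 + l1
    return lam * total

# Hypothetical toy example: 6 voxels arranged in 3 sets of 2.
sets = [(0, 1), (2, 3), (4, 5)]
clustered = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # both nonzero weights in one set
scattered = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # nonzero weights in two sets

p_clustered = sparse_group_penalty(clustered, sets)
p_scattered = sparse_group_penalty(scattered, sets)
print(p_clustered < p_scattered)
```

Both weight vectors have the same L1 norm, so a plain L1 penalty would not distinguish them. Under the assumed penalty, the solution that concentrates its nonzero weights in one set is cheaper, which is the sense in which this kind of regularizer prefers a few active sets while still allowing sparsity within each set.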

Local vs distributed representations: What is the right basis for interpretability?

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

Identifying Interpretable Visual Features in Artificial and Biological Neural Systems

When and where do feed-forward neural networks learn localist representations?

Functional Network: A Novel Framework for Interpretability of Deep Neural Networks

Visual Interpretability for Deep Learning

Interpreting Deep Neural Networks Through Variable Importance

How good Neural Networks interpretation methods really are? A quantitative benchmark

A Survey on Neural Network Interpretability

A Survey of the Interpretability Aspect of Deep Learning Models

Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization

Visual Interpretability for Deep Learning: a Survey

Interpretability of Machine Learning Methods Applied to Neuroimaging

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

Representations as Language: An Information-Theoretic Framework for Interpretability

Finding Distributed Needles in Neural Haystacks

Don't trust your eyes: on the (un)reliability of feature visualizations

Investigating the influence of noise and distractors on the interpretation of neural networks

Interpretability of deep learning models: A survey of results

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks