Abstract: Despite decades of research, much is still unknown about the computations carried out in the human face processing network. Recently, deep networks have been proposed as a computational account of human visual processing, but while they provide a good match to neural data throughout visual cortex, they lack interpretability. We introduce a method for interpreting brain activity using a new class of deep generative models, disentangled representation learning models, which learn a low-dimensional latent space that "disentangles" different semantically meaningful dimensions of faces, such as rotation, lighting, or hairstyle, in an unsupervised manner by enforcing statistical independence between dimensions. We find that the majority of our model's learned latent dimensions are interpretable by human raters. Further, these latent dimensions serve as a good encoding model for human fMRI data. We next investigate the representation of different latent dimensions across face-selective voxels. We find that low- and high-level face features are represented in posterior and anterior face-selective regions, respectively, corroborating prior models of human face recognition. Interestingly, though, we find both identity-relevant and identity-irrelevant face features across the face processing network. Finally, we provide new insight into the few "entangled" (uninterpretable) dimensions in our model by showing that they match responses in the ventral stream and carry information about facial identity. Disentangled face encoding models provide an exciting alternative to standard "black box" deep learning approaches for modeling and interpreting human brain data.

We use a class of interpretable deep neural network models, disentangled variational autoencoders (dVAEs), to analyze human fMRI data. We find that a dVAE learns human-interpretable dimensions of faces, such as lighting, expression, and hairstyle, and matches human fMRI data as well as comparable non-disentangled models do. Our disentangled encoding approach allows us to map individual disentangled features to region-of-interest (ROI) and voxel activity. A decoding analysis confirms that the model separates identity-relevant from identity-irrelevant information and reveals that the remaining entangled dimensions contain identity-relevant information. Together, these results highlight the use of disentangled models for more interpretable fMRI encoding than standard deep learning models allow.
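
The idea of using latent dimensions as an encoding model for voxel responses can be illustrated with a minimal sketch. The text above does not state which regression method was used, so this example assumes closed-form ridge regression, and it runs on synthetic data rather than real dVAE latents or fMRI responses. All names (`fit_ridge`, `voxelwise_correlation`, `Z_train`, `Y_train`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    X: (n_stimuli, n_latents) latent values per stimulus.
    Y: (n_stimuli, n_voxels) response per stimulus and voxel.
    No intercept is fitted, so inputs are assumed to be roughly zero-mean.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins (assumption): random "latents" and linearly generated
# "voxel responses" with additive Gaussian noise.
rng = np.random.default_rng(0)
n_train, n_test, n_latents, n_voxels = 200, 50, 24, 10
Z_train = rng.standard_normal((n_train, n_latents))
Z_test = rng.standard_normal((n_test, n_latents))
W_true = rng.standard_normal((n_latents, n_voxels))
Y_train = Z_train @ W_true + 0.5 * rng.standard_normal((n_train, n_voxels))
Y_test = Z_test @ W_true + 0.5 * rng.standard_normal((n_test, n_voxels))

# Fit on training stimuli, then score predictions on held-out stimuli.
W_hat = fit_ridge(Z_train, Y_train, alpha=1.0)
r = voxelwise_correlation(Y_test, Z_test @ W_hat)

# For each voxel, the latent dimension with the largest absolute weight.
preferred_latent = np.abs(W_hat).argmax(axis=0)
```

Because each column of `W_hat` holds one voxel's weights over the latent dimensions, a model with interpretable dimensions lets those weights be read directly as feature preferences, which is the appeal of the disentangled approach over a "black box" feature space.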

Deep Face Decoder: Towards understanding the embedding space of convolutional networks through visual reconstruction of deep face templates

DMT-EV: an Explainable Deep Network for Dimension Reduction.

Deeper Interpretability of Deep Networks

Disentangled deep generative models reveal coding principles of the human face processing network

DeepVisage: Making face recognition simple yet with powerful generalization skills

Discriminative Deep Feature Visualization for Explainable Face Recognition

Exploring Features and Attributes in Deep Face Recognition Using Visualization Techniques

Decoding Face Identity: A Reverse-Correlation Approach Using Deep Learning

Towards Interpretable Face Recognition

Reconstructing faces from fMRI patterns using deep generative neural networks

Explaining Deep Face Algorithms through Visualization: A Survey

Understanding Deep Face Representation Via Attribute Recovery

FaceNet: A unified embedding for face recognition and clustering

Context Prior-Based with Residual Learning for Face Detection: A Deep Convolutional Encoder–decoder Network

Deep Convolutional Neural Network Features and the Original Image

Reconstructing controllable faces from brain activity with hierarchical multiview representations

A Comprehensive Analysis of Deep Learning Based Representation for Face Recognition

The Face Inversion Effect in Deep Convolutional Neural Networks

A Deep Image Compression Framework for Face Recognition

Deep Cascade Model-Based Face Recognition: when Deep-Layered Learning Meets Small Data

Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model