Reconstructing Voice Identity from Noninvasive Auditory Cortex Recordings

Charly Lamothe, Etienne Thoret, Régis Trapeau, Bruno L. Giordano, Julien Sein, Sylvain Takerkart, Stéphane Ayache, Thierry Artières, Pascal Belin
DOI: https://doi.org/10.1101/2024.02.27.582302
2024-03-19
Abstract: The cerebral processing of voice information is known to engage, in human as well as non-human primates, “temporal voice areas” (TVAs) that respond preferentially to conspecific vocalizations. However, how voice information is represented by neuronal populations in these areas, particularly speaker identity information, remains poorly understood. Here, we used a deep neural network (DNN) to generate a high-level, small-dimension representational space for voice identity, the ‘voice latent space’ (VLS), and examined its linear relation with cerebral activity via encoding, representational similarity, and decoding analyses. We find that the VLS maps onto fMRI measures of cerebral activity in response to tens of thousands of voice stimuli from hundreds of different speaker identities and better accounts for the representational geometry for speaker identity in the TVAs than in A1. Moreover, the VLS allowed TVA-based reconstructions of voice stimuli that preserved essential aspects of speaker identity as assessed by both machine classifiers and human listeners. These results indicate that the DNN-derived VLS provides high-level representations of voice identity information in the TVAs.
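The abstract describes a linear relation between the VLS and cerebral activity, tested in both directions (encoding and decoding). As an illustration only, the following minimal sketch shows what such linear mappings can look like using closed-form ridge regression on synthetic data. It is not the authors' pipeline: the choice of ridge regression, the regularization strength `alpha`, the array sizes, the synthetic data, and every function and variable name are assumptions made for this example.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    No intercept is fitted, so inputs are assumed to be roughly zero-mean.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def columnwise_pearson(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    num = (A * B).sum(axis=0)
    den = np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))
    return num / den


# Synthetic stand-ins (hypothetical sizes): latent vectors Z play the role of
# VLS coordinates, Y plays the role of voxel responses to the same stimuli.
rng = np.random.default_rng(0)
n_train, n_test, n_latent, n_voxels = 200, 50, 8, 20
true_W = rng.normal(size=(n_latent, n_voxels))
Z_train = rng.normal(size=(n_train, n_latent))
Z_test = rng.normal(size=(n_test, n_latent))
Y_train = Z_train @ true_W + 0.1 * rng.normal(size=(n_train, n_voxels))
Y_test = Z_test @ true_W + 0.1 * rng.normal(size=(n_test, n_voxels))

# Encoding direction: predict voxel responses from latent coordinates.
W_enc = fit_ridge(Z_train, Y_train, alpha=1.0)
enc_scores = columnwise_pearson(Z_test @ W_enc, Y_test)

# Decoding direction: predict latent coordinates from voxel responses.
W_dec = fit_ridge(Y_train, Z_train, alpha=1.0)
dec_scores = columnwise_pearson(Y_test @ W_dec, Z_test)
```

Here `enc_scores` holds one held-out correlation per simulated voxel and `dec_scores` one per latent dimension. In a reconstruction setting, decoded latent vectors would then be passed to a generative model's decoder to synthesize a stimulus, a step this sketch does not cover.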
Neuroscience