Applying the Efficient Coding Principle to Understand Encoding of Multisensory and Multimodality Sensory Signals

Li Zhaoping
DOI: https://doi.org/10.1016/j.visres.2024.108489
IF: 1.8
2024-01-01
Vision Research
Abstract: Sensory neurons often encode multisensory or multimodal signals. For example, many medial superior temporal (MST) neurons are tuned to the heading direction of self-motion based on visual (optic flow) signals and vestibular signals. Middle temporal (MT) cortical neurons are tuned to object depth from signals of two visual modalities: motion parallax and binocular disparity. An MST neuron's preferred heading directions from the different senses can be congruent (matched) or opposite to each other. Similarly, the preferred depths of an MT neuron from the two modalities are congruent in some neurons and opposite in others. While congruent tuning appears natural for cue integration, the function of opposite tuning has been puzzling. This paper explains these tunings from the efficient coding principle, which holds that sensory encoding extracts as much sensory information as possible while minimizing neural cost. It extends previous applications of this principle to understanding neural receptive fields in the retina and the primary visual cortex, particularly the multimodal encoding of cone signals or binocular signals. Congruent and opposite sensory signals, which excite the congruent and opposite neurons, respectively, are the decorrelated sensory components that provide a general-purpose, efficient representation of sensory inputs before task-specific object segmentation and recognition. The framework can be extended to encoding signals from more than two sensory sources, e.g., from three cone types. It also predicts a wider tuning width for opposite than for congruent neurons, the existence of neurons that are neither congruent nor opposite, and how neural receptive fields adapt to statistical changes in sensory environments.
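The decorrelation idea in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's model. It assumes two scalar cue signals with equal variance and an arbitrary correlation of 0.6, and it takes the sum and difference of the two cues as stand-ins for the "congruent" and "opposite" components. The function names (`make_cue_pair`, `decorrelate`, `correlation`, `variance`) and all parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def variance(xs):
    """Population variance of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def make_cue_pair(n_samples, rho, seed=0):
    """Draw two unit-variance Gaussian cues whose correlation is about rho.

    Assumption: each cue mixes one shared source with one private source.
    This generative model is illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    w_shared = math.sqrt(rho)
    w_private = math.sqrt(1.0 - rho)
    s1, s2 = [], []
    for _ in range(n_samples):
        shared = rng.gauss(0.0, 1.0)
        s1.append(w_shared * shared + w_private * rng.gauss(0.0, 1.0))
        s2.append(w_shared * shared + w_private * rng.gauss(0.0, 1.0))
    return s1, s2


def decorrelate(s1, s2):
    """Rotate the two cues into sum ("congruent") and difference ("opposite")."""
    scale = 1.0 / math.sqrt(2.0)
    s_plus = [scale * (a + b) for a, b in zip(s1, s2)]
    s_minus = [scale * (a - b) for a, b in zip(s1, s2)]
    return s_plus, s_minus


if __name__ == "__main__":
    s1, s2 = make_cue_pair(n_samples=20000, rho=0.6)
    s_plus, s_minus = decorrelate(s1, s2)
    print("corr(S1, S2)  =", round(correlation(s1, s2), 3))
    print("corr(S+, S-)  =", round(correlation(s_plus, s_minus), 3))
    print("var(S+), var(S-) =", round(variance(s_plus), 3), round(variance(s_minus), 3))
```

Under these assumptions the original cues are strongly correlated, while the sum and difference components are nearly uncorrelated, with the sum carrying variance close to 1 + rho and the difference close to 1 - rho. How such components map onto actual neural tuning is the subject of the paper and is not captured by this sketch.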