Abstract: Predictive coding is a leading theory of cortical function which posits that the brain continually makes predictions of incoming sensory stimuli using a hierarchical network of top-down and bottom-up connections. This theory is supported by prior work showing that PredNet, a deep learning network designed according to predictive coding principles, exhibits several characteristics of neural responses commonly observed in primate visual cortex. However, one ubiquitous neural phenomenon that has not yet been investigated is short-term visual adaptation: the adjustment of neural responses over time when exposed to static visual inputs that are either prolonged or directly repeated. Here, we examine whether PredNet exhibits two neural signatures of temporal adaptation previously observed in intracranial recordings of human participants viewing prolonged and repeated stimuli (Brands et al., 2024). We find that, like human visual cortex, PredNet adapts to static images, as evidenced by subadditive temporal response summation: a non-linear accumulation of response magnitudes when prolonging stimulus durations, which results from neurally plausible transient-sustained dynamics in the unit activation time courses. However, PredNet activations also show a systematic response to stimulus offsets, which is absent in the human neural data. For repeated stimuli, PredNet shows slight response suppression for any two images presented in quick succession, but it does not show repetition suppression, that is, a stronger response reduction for identical than for non-identical image pairs, which is robustly observed throughout human visual cortex. We show that these results are stable across multiple training datasets and two different types of loss computation. Lastly, in both PredNet and the neural data, we find a relationship between temporal adaptation and visual input properties, showing that temporally sustained activity is enhanced for more complex scenes containing clutter.
Altogether, these results suggest that the emergent temporal dynamics in PredNet only partly align with neural data and are linked to low-level properties of the visual input rather than high-level predictions arising from top-down processes.

Deep predictive coding networks partly capture neural signatures of short-term temporal adaptation in human visual cortex