A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model With a Hybrid FPGA Implementation

Jamal Molin,Chetan Thakur,Ernst Niebur,Ralph Etienne-Cummings
DOI: https://doi.org/10.1109/TBCAS.2021.3089622
Abstract:Computing and attending to salient regions of a visual scene is an innate and necessary preprocessing step for both biological and engineered systems performing high-level visual tasks including object detection, tracking, and classification. Computational bandwidth and speed are improved by preferentially devoting computational resources to salient regions of the visual field. The human brain computes saliency effortlessly, but modeling this task in engineered systems is challenging. We first present a neuromorphic dynamic saliency model, which is bottom-up, feed-forward, and based on the notion of proto-objects with neurophysiological spatio-temporal features requiring no training. Our neuromorphic model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations (i.e., ground truth saliency). Secondly, we present a hybrid FPGA implementation of the model for real-time applications, capable of processing 112×84 resolution frames at 18.71 Hz running at a 100 MHz clock rate - a 23.77× speedup from the software implementation. Additionally, our fixed-point model of the FPGA implementation yields comparable results to the software implementation.
What problem does this paper attempt to address?