Towards Human-Leveled Vision Systems

JianHao Ding, TieJun Huang
DOI: https://doi.org/10.1007/s11431-024-2762-5
2024-01-01
Science China Technological Sciences
Abstract: The human visual system is a complex, interconnected network comprising billions of neurons. It plays an essential role in translating environmental light stimuli into information that guides and shapes human perception and action. Research on the visual system aims to uncover the neural structural principles underlying human visual perception and their possible applications. Currently, there are two main approaches: analysis and simulation of the biological system, and artificial intelligence models based on deep learning. Here we discuss both approaches to human-level vision systems. Deep learning has significantly impacted the field of vision, with achievements in representation, modeling, and hardware design. However, a significant gap remains between deep learning models and the human visual system in terms of scalability, transferability, and sustainability. Progress in research on the biological visual system can help fill this gap by deepening our understanding of the properties and functions of the system's components. We take efforts to reconstruct the retina as an example to illustrate that, even though we cannot yet replicate the visual system on a computer, we can still learn a great deal by combining existing research outcomes in neuroscience. At the end of the paper, we suggest returning to biological structures and gradually building visual systems from their computational counterparts, in order to achieve a human-level vision system in the future.