Hypercolumn-array based image representation and its application to shape-based object detection.

Hui Wei, Zheng Dong, Bing Liu
DOI: https://doi.org/10.1016/j.asoc.2016.10.031
IF: 8.7
2017-01-01
Applied Soft Computing
Abstract: We simulate the mechanism of hypercolumns in the V1 cortex of mammals, which respond selectively to bar stimuli, and design an orderly-arranged array to extract and represent edges in an image. Based on the neighborhood of units in the array, we construct a graph in which each node represents a short line segment of a contour. We search along the routes in that graph and compare them with a shape template for object detection. Organizing segments of contour into a graph greatly upgrades the level of image representation, remarkably reduces the load of combinations, significantly improves the efficiency of object searching, and facilitates the incorporation of high-level knowledge. Biological and psychological evidence increasingly reveals that high-level geometrical and topological features are the keys to shape-based object recognition in the brain. Attracted by the excellent performance of neural visual systems, we simulate the mechanism of hypercolumns in the mammalian cortical area V1, which respond selectively to oriented bar stimuli. We design an orderly-arranged hypercolumn array to extract and represent linear or near-linear stimuli in an image. Each unit of this array covers stimuli of various orientations in a small area, and multiple units together produce a low-dimensional vector to describe shape. Based on the neighborhood of units in the array, we construct a graph in which each node represents a short line segment with a certain position and slope. A contour segment in the image can therefore be represented by a route in this graph. The graph converts an image, which typically consists of unstructured raw data, into structured and semantically enriched data. We search along the routes in the graph and compare them with a shape template for object detection. The graph greatly upgrades the level of image representation, remarkably reduces the load of combinations, significantly improves the efficiency of object searching, and facilitates the incorporation of high-level knowledge. This work provides a systematic infrastructure for shape-based object recognition.
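The pipeline described in the abstract (array units, a neighborhood graph of short line segments, and route search against a shape template) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each array unit has already been reduced to a single dominant orientation in radians (or `None` if the unit is silent). The function names, the 8-neighbor connectivity, the `max_turn` compatibility threshold, and the sum-of-orientation-differences cost are all hypothetical choices made for illustration.

```python
import math
from itertools import product

def angle_diff(a, b):
    """Smallest difference between two undirected orientations (period pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def build_graph(grid, max_turn=math.pi / 4):
    """Build an adjacency dict over active units.

    grid[r][c] is an orientation in radians, or None for a silent unit.
    Nodes are (r, c); edges join 8-neighbors whose orientations differ
    by at most max_turn (an assumed compatibility rule).
    """
    rows, cols = len(grid), len(grid[0])
    graph = {}
    for r, c in product(range(rows), range(cols)):
        if grid[r][c] is None:
            continue
        nbrs = []
        for dr, dc in product((-1, 0, 1), repeat=2):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < rows and 0 <= cc < cols):
                continue
            other = grid[rr][cc]
            if other is not None and angle_diff(grid[r][c], other) <= max_turn:
                nbrs.append((rr, cc))
        graph[(r, c)] = nbrs
    return graph

def routes(graph, start, length):
    """Enumerate simple routes with `length` nodes starting at `start` (DFS)."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) == length:
            yield path
            continue
        for n in graph[path[-1]]:
            if n not in path:
                stack.append(path + [n])

def route_cost(grid, route, template):
    """Sum of orientation differences between a route and a template,
    where the template is a list of orientations of the same length."""
    return sum(angle_diff(grid[r][c], t) for (r, c), t in zip(route, template))

# Toy array: a horizontal contour through the middle row of a 3x3 array.
grid = [
    [None, None, None],
    [0.0, 0.0, 0.0],
    [None, None, None],
]
graph = build_graph(grid)
found = list(routes(graph, (1, 0), 3))
```

In this toy case the only route of three nodes starting at `(1, 0)` runs along the middle row, and its cost against a horizontal template is zero. Because candidate contours are restricted to routes through compatible neighboring units, the search does not have to consider arbitrary combinations of segments, which is the kind of reduction the abstract attributes to the graph representation.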