Generative visual common sense: Testing analysis-by-synthesis on Mondrian-style image.

Ning Tang, Siyi Gong, Jifan Zhou, Mowei Shen, Tao Gao
DOI: https://doi.org/10.1037/xge0001413
2023-05-18
Abstract: The well-known Mondrian-style images, aside from being aesthetically amusing, also reflect the core principles of human vision in their viewing experience. First, when we see a Mondrian-style image consisting only of a grid and primary colors, we may automatically interpret its causal history such that it was generated by recursively partitioning a blank scene. Second, the image we observe is open to many possible ways of partitioning, and their probabilities of dominating the interpretation can be captured by a probabilistic distribution. Moreover, the causal interpretation of a Mondrian-style image can emerge almost spontaneously, not being tailored to any specific task. Using Mondrian-style images as a case study, we demonstrate the generative nature of human vision by showing that a Bayesian model based upon an image-generation task can support a wide range of visual tasks with little retraining. Our model, learned from human-synthesized Mondrian-style images, could predict human performance in the perceptual complexity ranking, capture the transmission stability when images were iteratively passed among participants, and pass a visual Turing test. Our results collectively show that human vision is causal such that we interpret an image from the angle of how it was generated. The success of generalization with little retraining suggests that generative vision constitutes a type of common sense that supports a wide range of tasks of different natures. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
psychology, experimental
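
The abstract describes a Mondrian-style image as the result of recursively partitioning a blank scene. The sketch below illustrates that general idea only; it is not the authors' model. The function names (`partition`, `generate_mondrian`), the split probability, the cut range, the depth limit, and the palette (primary colors plus white) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass

# Assumed palette: the abstract mentions primary colors; white is added here
# as an illustrative choice, not something stated in the source.
PALETTE = ("red", "yellow", "blue", "white")


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float
    color: str


def partition(x, y, w, h, depth, rng, p_split=0.7, max_depth=4):
    """Recursively split a rectangle and return its leaf cells.

    p_split and max_depth are hypothetical parameters for this sketch.
    """
    # Stop splitting at the depth limit, or at random with prob. 1 - p_split.
    if depth >= max_depth or rng.random() > p_split:
        return [Rect(x, y, w, h, rng.choice(PALETTE))]

    frac = rng.uniform(0.3, 0.7)  # cut position, kept away from the edges
    if rng.random() < 0.5:
        # Vertical cut: left and right children.
        w1 = w * frac
        return (partition(x, y, w1, h, depth + 1, rng, p_split, max_depth)
                + partition(x + w1, y, w - w1, h, depth + 1, rng, p_split, max_depth))
    # Horizontal cut: top and bottom children.
    h1 = h * frac
    return (partition(x, y, w, h1, depth + 1, rng, p_split, max_depth)
            + partition(x, y + h1, w, h - h1, depth + 1, rng, p_split, max_depth))


def generate_mondrian(seed=None):
    """Generate one image as a list of colored cells tiling the unit square."""
    rng = random.Random(seed)
    return partition(0.0, 0.0, 1.0, 1.0, 0, rng)


if __name__ == "__main__":
    for cell in generate_mondrian(seed=0):
        print(cell)
```

Different sequences of cuts can produce the same final grid, which is why, as the abstract notes, a single observed image is open to many possible partitioning histories.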