Low-level features predict perceived similarity for naturalistic images

Emily J A-Izzeddin, Thomas S. A. Wallis, Jason B Mattingley, William J Harrison
DOI: https://doi.org/10.1101/2024.08.15.607867
2024-08-19
Abstract: The mechanisms by which humans perceptually organise individual regions of a visual scene to generate a coherent scene representation remain largely unknown. Our perception of statistical regularities has been relatively well-studied in simple stimuli, and explicit computational mechanisms that use low-level image features (e.g., luminance, contrast energy) to explain these perceptions have been described. Here, we investigate to what extent observers can effectively use such low-level information present in isolated naturalistic scene regions to facilitate associations between said regions. Across two experiments, participants were shown an isolated standard patch, then required to select which of two subsequently presented patches came from the same scene as the standard (2AFC). In Experiment 1, participants were consistently above chance when performing such association judgements. Additionally, participants' responses were well-predicted by a generalised linear multilevel model (GLMM) employing predictors based on low-level feature similarity metrics (specifically, pixel-wise luminance and phase-invariant structure correlations). In Experiment 2, participants were presented with thresholded image regions, or regions reduced to only their edge content. Their performance was significantly poorer than when they viewed unaltered image regions. Nonetheless, the model still correlated well with participants' judgements. Our findings suggest that image region associations can be reduced to low-level feature correlations, providing evidence for the contribution of such basic features to judgements made on complex visual stimuli.
Neuroscience
What problem does this paper attempt to address?
The paper asks how humans use low-level image features (such as luminance and contrast energy) to associate separate regions of a natural scene and form a coherent scene representation. Across two experiments, it tests whether participants shown isolated scene patches can use these features to judge which patches come from the same scene. It further examines how these judgements change when image regions are thresholded or reduced to only their edge content. The overall goal is to understand how much low-level visual features contribute to judgements about complex visual stimuli and to the construction of a scene representation.
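To make the idea of a low-level similarity metric concrete, the following is a minimal sketch of a pixel-wise luminance correlation applied to a 2AFC trial. It is an illustration under stated assumptions, not the authors' model: the function names, the random test patches, and the deterministic "pick the more correlated patch" rule are all hypothetical. The paper instead enters such metrics as predictors in a GLMM, and its phase-invariant structure metric is not shown here.

```python
import numpy as np


def luminance_correlation(patch_a, patch_b):
    """Pearson correlation between pixel luminances of two same-sized patches."""
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])


def choose_patch(standard, option_1, option_2):
    """Return 0 or 1: the index of the option more correlated with the standard.

    Hypothetical decision rule for illustration only; it is not the GLMM
    described in the abstract.
    """
    r1 = luminance_correlation(standard, option_1)
    r2 = luminance_correlation(standard, option_2)
    return 0 if r1 >= r2 else 1


# Synthetic stand-ins for image patches (assumed 32x32 luminance arrays).
rng = np.random.default_rng(0)
standard = rng.random((32, 32))
target = standard + 0.1 * rng.random((32, 32))  # shares luminance structure
foil = rng.random((32, 32))                     # unrelated patch

choice = choose_patch(standard, target, foil)
```

In this toy setup the target patch is built from the standard, so it should be selected over the foil. Real scene patches from different locations would share far less pixel-level structure, which is why the paper treats similarity as a graded predictor of choices.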