Finding Closure: A Closer Look at the Gestalt Law of Closure in Convolutional Neural Networks

Yuyan Zhang, Derya Soydaner, Lisa Koßmann, Fatemeh Behrad, Johan Wagemans
2024-08-22
Abstract: The human brain has an inherent ability to fill in gaps to perceive figures as complete wholes, even when parts are missing or fragmented. This phenomenon is known as Closure in psychology, one of the Gestalt laws of perceptual organization, explaining how the human brain interprets visual stimuli. Given the importance of Closure for human object recognition, we investigate whether neural networks rely on a similar mechanism. Exploring this crucial human visual skill in neural networks has the potential to highlight their comparability to humans. Recent studies have examined the Closure effect in neural networks. However, they typically focus on a limited selection of Convolutional Neural Networks (CNNs) and have not reached a consensus on their capability to perform Closure. To address these gaps, we present a systematic framework for investigating the Closure principle in neural networks. We introduce well-curated datasets designed to test for Closure effects, including both modal and amodal completion. We then conduct experiments on various CNNs employing different measurements. Our comprehensive analysis reveals that VGG16 and DenseNet-121 exhibit the Closure effect, while other CNNs show variable results. We interpret these findings by blending insights from psychology and neural network research, offering a unique perspective that enhances transparency in understanding neural networks. Our code and dataset will be made available on GitHub.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper asks whether convolutional neural networks (CNNs) perceive incomplete shapes through the Gestalt principle of "Closure", as the human visual system does. Specifically, the researchers designed a series of experiments to evaluate whether different types of CNNs, when given partially missing or fragmented images, recognize them by completing the figure as humans do.

### Research Background

- **Human Visual System**: The human brain can fill in the missing parts of visual stimuli. Even if parts of an image are missing or fragmented, people still perceive it as a complete whole. This phenomenon is called "Closure" and is an important principle in Gestalt psychology.
- **Limitations of CNNs**: Although CNNs have achieved remarkable success in many computer vision tasks, how they handle incomplete images, and whether they recognize such images through a human-like closure effect, remain open questions.

### Research Objectives

- **Verify the Closure Ability of CNNs**: Through a series of experiments, evaluate whether different types of CNNs show the closure effect when processing incomplete images.
- **Compare the Performance of Different CNNs**: Explore how CNNs with different architectures differ on closure tasks, providing a new perspective on the internal mechanisms of these models.

### Experimental Design

1. **Dataset Design**:
   - **Triangle Segment Completion**: Generate a dataset containing complete triangles, aligned triangle segments, and misaligned triangle segments to test the closure ability of CNNs.
   - **Kanizsa Triangles**: Use Kanizsa triangles (three incomplete discs arranged so that they form an illusory white triangle) to further test the closure ability of CNNs.
2. **Measurement Method**:
   - **Similarity Method**: Evaluate the closure effect by comparing two similarities: that between the aligned triangle segments and the complete triangle, and that between the misaligned triangle segments and the complete triangle. The closure measure is defined as:
     \[ \text{Closure Measurement Value} = \text{Similarity}(\text{aligned}, \text{complete}) - \text{Similarity}(\text{misaligned}, \text{complete}) \]
     where the similarity is the cosine similarity:
     \[ \text{Similarity}(x, y) = \frac{f(x) \cdot f(y)}{\|f(x)\| \, \|f(y)\|} \]
     and \( f(x) \) is the output vector of a given layer of the model for input \( x \). A positive value means the aligned segments are represented as more similar to the complete triangle than the misaligned segments are.
3. **Experimental Results**:
   - **Triangle Segment Completion**: VGG16, EfficientNet B0, Inception V3, SqueezeNet V1.1, and ShuffleNet V2 show a strong closure effect once the edge length exceeds 13 pixels, while AlexNet and DenseNet-121 show a weaker effect, and ResNet-50 and MobileNet V3 show almost no closure effect.
   - **Kanizsa Triangles**: With Kanizsa triangles, most models show only a weak closure effect, possibly because the current measurement method is not sensitive enough to fully capture the model's ability to perceive the dotted outline.

### Conclusion

Through a systematic experimental design, this research provides initial evidence that some CNNs show the closure effect when processing incomplete images, but that the effect varies with model architecture and measurement method. These findings help clarify the internal mechanisms of CNNs and suggest new ideas for designing AI models closer to the human visual system.
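
To make the closure measure from the Measurement Method concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the feature vectors \( f(x) \) have already been extracted from some layer of a network and are given as plain lists of floats. The function names `cosine_similarity` and `closure_measure`, and the toy activations, are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity f(x).f(y) / (||f(x)|| ||f(y)||) of two feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def closure_measure(
    f_complete: Sequence[float],
    f_aligned: Sequence[float],
    f_misaligned: Sequence[float],
) -> float:
    """Similarity(aligned, complete) - Similarity(misaligned, complete)."""
    return cosine_similarity(f_aligned, f_complete) - cosine_similarity(
        f_misaligned, f_complete
    )


# Toy activations, made up for illustration (not data from the paper).
f_complete = [1.0, 0.0]    # hypothetical features of the complete triangle
f_aligned = [1.0, 0.0]     # aligned segments: same direction as the complete shape
f_misaligned = [0.0, 1.0]  # misaligned segments: orthogonal to the complete shape

print(closure_measure(f_complete, f_aligned, f_misaligned))  # prints 1.0
```

Because each cosine similarity lies in [-1, 1], the measure lies in [-2, 2]. In practice, the feature vectors would come from flattening the activations of a chosen layer for each stimulus, and the measure would be averaged over many stimuli.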