Neural Encoding for Image Recall: Human-Like Memory

Virgile Foussereau, Robin Dumas
2024-09-18
Abstract: Achieving human-like memory recall in artificial systems remains a challenging frontier in computer vision. Humans demonstrate remarkable ability to recall images after a single exposure, even after being shown thousands of images. However, this capacity diminishes significantly when confronted with non-natural stimuli such as random textures. In this paper, we present a method inspired by human memory processes to bridge this gap between artificial and biological memory systems. Our approach focuses on encoding images to mimic the high-level information retained by the human brain, rather than storing raw pixel data. By adding noise to images before encoding, we introduce variability akin to the non-deterministic nature of human memory encoding. Leveraging pre-trained models' embedding layers, we explore how different architectures encode images and their impact on memory recall. Our method achieves impressive results, with 97% accuracy on natural images and near-random performance (52%) on textures. We provide insights into the encoding process and its implications for machine learning memory systems, shedding light on the parallels between human and artificial intelligence memory mechanisms.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses how to give artificial systems human-like memory. Specifically, the researchers seek a method that lets a machine remember a large number of natural images after seeing each one once, while performing at near-random levels on non-natural stimuli such as random textures. They compare different encoder models (such as CLIP and AlexNet) and add noise or blur during encoding to simulate the uncertainty of human memory. Their findings:
- For natural images, the CLIP encoder with an appropriate level of Gaussian noise (standard deviation of 20) achieves a recognition accuracy of 98%, while its performance on texture images is close to random (52%).
- In contrast, AlexNet recognizes texture images better under low-noise conditions, but its performance declines significantly as the noise increases.
These results indicate that choosing an appropriate encoding method and pre-trained model can make the memory behavior of an artificial system more similar to human memory.
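The noise-then-encode idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder here is a toy random projection standing in for a pre-trained embedding layer (CLIP or AlexNet in the paper), and the recall test is assumed to be a two-alternative choice decided by cosine similarity, which the text above does not specify. The names add_gaussian_noise, make_toy_encoder, build_memory and recall_2afc are hypothetical.

```python
import numpy as np


def add_gaussian_noise(image, std=20.0, rng=None):
    """Add zero-mean Gaussian noise (in 0-255 pixel units) and clip to range."""
    if rng is None:
        rng = np.random.default_rng()
    noisy = image.astype(np.float64) + rng.normal(0.0, std, size=image.shape)
    return np.clip(noisy, 0.0, 255.0)


def make_toy_encoder(input_dim, embed_dim=64, seed=0):
    """Hypothetical stand-in for a pre-trained embedding layer.

    A fixed random projection followed by L2 normalisation; a real
    experiment would call a pre-trained model here instead.
    """
    proj = np.random.default_rng(seed).normal(size=(input_dim, embed_dim))

    def encode(image):
        v = np.asarray(image, dtype=np.float64).reshape(-1) @ proj
        return v / np.linalg.norm(v)

    return encode


def build_memory(images, encode, std=20.0, rng=None):
    """Store one embedding per image, computed from a noisy copy of it."""
    return np.stack([encode(add_gaussian_noise(img, std, rng)) for img in images])


def recall_2afc(memory, seen, unseen, encode):
    """Assumed recall test: pick the candidate closest to any stored embedding.

    Returns True when the previously seen image wins the comparison.
    """
    sim_seen = np.max(memory @ encode(seen))
    sim_unseen = np.max(memory @ encode(unseen))
    return bool(sim_seen > sim_unseen)


# Usage on synthetic data (random pixels, not natural images or textures).
rng = np.random.default_rng(42)
images = rng.integers(0, 256, size=(40, 8, 8, 3))
seen_images, unseen_images = images[:20], images[20:]

encode = make_toy_encoder(input_dim=8 * 8 * 3)
memory = build_memory(seen_images, encode, std=20.0, rng=rng)

correct = sum(
    recall_2afc(memory, s, u, encode) for s, u in zip(seen_images, unseen_images)
)
print(f"2AFC accuracy on toy data: {correct / len(seen_images):.2f}")
```

Because the toy encoder is linear, it keeps pixel-level detail that a pre-trained network would discard, so this sketch shows only the pipeline (noise, encode, compare) and says nothing about the natural-image versus texture gap reported in the paper.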