Abstract: The generally unsupervised nature of autoencoder models implies that the main training metric is formulated as the error between input images and their corresponding reconstructions. Different reconstruction loss variations and latent space regularizations have been shown to improve model performance, depending on the task to be solved, and to induce new desirable properties such as disentanglement. Nevertheless, measuring success in, or enforcing properties through, the input pixel space remains challenging. In this work, we aim to use the available data more efficiently and to provide design choices to be considered when recording or generating future datasets, so that desirable properties are induced implicitly during training. To this end, we propose a new sampling technique which matches semantically important parts of the image while randomizing the other parts, leading to salient feature extraction and a neglect of unimportant details. The proposed method can be combined with any existing reconstruction loss, and its performance gain is superior to that of the triplet loss. We analyse the resulting properties on various datasets and show improvements on several computer vision tasks: illumination and unwanted features can be normalized or smoothed out and shadows are removed, such that classification or other tasks work more reliably; better invariance with respect to unwanted features is induced; the generalization capacity from synthetic to real images is improved, such that more of the semantics are preserved; and uncertainty estimation is superior to Monte Carlo Dropout and an ensemble of models, particularly for datasets of higher visual complexity. Finally, classification accuracy by means of simple linear classifiers in the latent space is improved compared to the triplet loss. For each task, the improvements are highlighted on several datasets commonly used by the research community, as well as in automotive applications.
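The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical dataset in which every image carries a semantic key (for example, a scene identifier) and the variants under one key differ only in nuisance factors such as illumination. The names `build_groups`, `sample_pairs`, and the toy `samples` list are illustrative assumptions. For each input, the reconstruction target is drawn at random from the other variants sharing the same key, so only the semantically matching content is consistent between input and target.

```python
import random
from collections import defaultdict


def build_groups(samples):
    """Map each semantic key to the indices of the samples that share it.

    `samples` is a list of (semantic_key, image) tuples; this layout is an
    assumption made for the sketch, not something specified by the paper.
    """
    groups = defaultdict(list)
    for idx, (semantic_key, _image) in enumerate(samples):
        groups[semantic_key].append(idx)
    return groups


def sample_pairs(samples, rng):
    """Return (input_index, target_index) pairs for one training epoch.

    The target shares the semantic key of the input but is a randomly
    chosen different variant whenever more than one variant exists.
    """
    groups = build_groups(samples)
    pairs = []
    for idx, (semantic_key, _image) in enumerate(samples):
        candidates = [c for c in groups[semantic_key] if c != idx]
        if not candidates:
            # Only one variant available: fall back to plain reconstruction.
            candidates = [idx]
        pairs.append((idx, rng.choice(candidates)))
    return pairs


# Toy data: strings stand in for image arrays.
samples = [
    ("scene_a", "a_bright"),
    ("scene_a", "a_dark"),
    ("scene_a", "a_shadow"),
    ("scene_b", "b_bright"),
    ("scene_b", "b_dark"),
]
pairs = sample_pairs(samples, random.Random(0))
```

In a training loop, the image at `input_index` would be fed to the encoder and any reconstruction loss would be computed against the image at `target_index` instead of against the input itself.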

How Much Training Data is Memorized in Overparameterized Autoencoders? An Inverse Problem Perspective on Memorization Evaluation

Modeling Visual Memorability Assessment with Autoencoders Reveals Characteristics of Memorable Images

Learn to Forget: Memorization Elimination for Neural Networks

Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection

Exploring Memorization in Adversarial Training

Memorized Images in Diffusion Models share a Subspace that can be Located and Deleted

How Does the Memorization of Neural Networks Impact Adversarial Robust Models?

Measuring Catastrophic Forgetting in Neural Networks

An Inversion-based Measure of Memorization for Diffusion Models

Memorization and Generalization in Neural Code Intelligence Models

Autoencoder and Partially Impossible Reconstruction Losses

Generalizability of Memorization Neural Networks

Associative Memory in Iterated Overparameterized Sigmoid Autoencoders

On the Over-Memorization During Natural, Robust and Catastrophic Overfitting

Unintended Memorization in Large ASR Models, and How to Mitigate It

Memorization in deep learning: A survey

Memorization with neural nets: going beyond the worst case

Changing the Image Memorability: From Basic Photo Editing to GANs

On Memorization in Diffusion Models

Pivotal Auto-Encoder via Self-Normalizing ReLU

Measuring Forgetting of Memorized Training Examples