Abstract: The generally unsupervised nature of autoencoder models implies that the main training metric is formulated as the error between input images and their corresponding reconstructions. Different reconstruction loss variations and latent space regularizations have been shown to improve model performance depending on the task to solve and to induce new desirable properties such as disentanglement. Nevertheless, measuring success in, or enforcing properties through, the input pixel space is a challenging endeavour. In this work, we want to make use of the available data more efficiently and provide design choices to be considered in the recording or generation of future datasets to implicitly induce desirable properties during training. To this end, we propose a new sampling technique which matches semantically important parts of the image while randomizing the other parts, leading to salient feature extraction and a neglect of unimportant details. The proposed method can be combined with any existing reconstruction loss and the performance gain is superior to that of the triplet loss. We analyse the resulting properties on various datasets and show improvements on several computer vision tasks: illumination and unwanted features can be normalized or smoothed out and shadows are removed such that classification or other tasks work more reliably; better invariance with respect to unwanted features is induced; the generalization capacity from synthetic to real images is improved, such that more of the semantics are preserved; uncertainty estimation is superior to Monte Carlo Dropout and an ensemble of models, particularly for datasets of higher visual complexity. Finally, classification accuracy by means of simple linear classifiers in the latent space is improved compared to the triplet loss. For each task, the improvements are highlighted on several datasets commonly used by the research community, as well as in automotive applications.
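The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several things are assumed for illustration only: each sample carries a hypothetical `scene_id` that marks which images share the same semantically important content, images are flat lists of pixel values, and a plain mean squared error stands in for "any existing reconstruction loss". The functions `group_by_scene`, `sample_pair` and `mse` are hypothetical names. For a given input, the reconstruction target is drawn at random from the other images of the same scene, so only the shared content can be reconstructed consistently.

```python
import random
from collections import defaultdict


def group_by_scene(samples):
    """Map each (hypothetical) scene id to the indices of its images."""
    groups = defaultdict(list)
    for idx, (scene_id, _pixels) in enumerate(samples):
        groups[scene_id].append(idx)
    return dict(groups)


def sample_pair(samples, groups, idx, rng):
    """Return (input, target) where the target is a different image of the
    same scene, chosen at random. Falls back to the input itself when the
    scene has only one image (assumption: plain autoencoding in that case)."""
    scene_id, pixels = samples[idx]
    others = [j for j in groups[scene_id] if j != idx]
    target_idx = rng.choice(others) if others else idx
    return pixels, samples[target_idx][1]


def mse(prediction, target):
    """Mean squared error, standing in for any reconstruction loss."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


# Toy data: pixels 1 and 2 are the shared (important) content of scene_a,
# pixels 0 and 3 vary between recordings (e.g. illumination).
samples = [
    ("scene_a", [0.1, 0.9, 0.1, 0.2]),
    ("scene_a", [0.6, 0.9, 0.1, 0.7]),
    ("scene_a", [0.3, 0.9, 0.1, 0.5]),
    ("scene_b", [0.8, 0.2, 0.7, 0.1]),
]
groups = group_by_scene(samples)
rng = random.Random(0)

inp, tgt = sample_pair(samples, groups, 0, rng)
# An identity "model" is used here only to show where the loss is applied.
loss = mse(inp, tgt)
```

In a real training loop the first argument of `mse` would be the decoder output for `inp`. Because the varying pixels of the target change from draw to draw, the loss can only be driven down reliably on the content that all images of a scene have in common.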

Using Swarm Optimization To Enhance Autoencoders Images

Optimizing Convolutional Neural Network Hyperparameters by Enhanced Swarm Intelligence Metaheuristics

Steered Mixture-of-Experts Autoencoder Design for Real-Time Image Modelling and Denoising

Cascade Decoders-Based Autoencoders for Image Reconstruction

Improving The Reconstruction Quality by Overfitted Decoder Bias in Neural Image Compression

EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder

Denoising Masked Autoencoders Help Robust Classification.

A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification

A comparative Analysis of Image Enhancement Using Autoencoders and Generative Adversarial Networks

Autoencoder and Partially Impossible Reconstruction Losses

Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images

Image Quality Assessment Techniques Show Improved Training and Evaluation of Autoencoder Generative Adversarial Networks

Neural Architecture Search using Particle Swarm and Ant Colony Optimization

Robustly overfitting latents for flexible neural image compression

An efficient automated image caption generation by the encoder decoder model

Training Stacked Denoising Autoencoders for Representation Learning

Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders

Assessment of Optimizers impact on Image Recognition with Convolutional Neural Network to Adversarial Datasets

Optimization of Convolutional Neural Network Using the Linearly Decreasing Weight Particle Swarm Optimization

Enhancing a Convolutional Autoencoder with a Quantum Approximate Optimization Algorithm for Image Noise Reduction

Sample what you can't compress