ICE: Implicit Coordinate Encoder for Multiple Image Neural Representation

Fernando Rivas-Manzaneque, Angela Ribeiro, Orlando Avila-Garcia
DOI: https://doi.org/10.1109/TIP.2023.3299501
Abstract: In recent years, implicit neural representations (INR) have shown great potential for solving many computer graphics and computer vision problems. With this technique, signals such as 2D images or 3D shapes can be fit by training multi-layer perceptrons (MLP) on continuous functions, providing many advantages over conventional discrete representations. Despite being considered a promising approach to 2D image encoding and compression, the application of INR to image collections remains a challenge, since the number of parameters needed grows rapidly with the number of images. In this paper, we propose a fully implicit approach to INR which drastically reduces the size of MLP models in multiple image representation tasks. We introduce the concept of the implicit coordinate encoder (ICE) and show that it can be used to scale INR with the number of images, specifically by learning a common feature space between images. Furthermore, we show that our method is valid not only for image collections but also for large (gigapixel) images by applying a "divide-and-conquer" strategy. We propose an auto-encoder deep neural network architecture, with a single ICE (encoder) and multiple MLPs (decoders), which are jointly trained following a multi-task learning strategy. We demonstrate the benefits of ICE when it is implemented as a one-dimensional convolutional encoder, including better performance of the downstream MLP models with an order of magnitude fewer parameters. Our method is the first to make use of convolutional blocks in INR networks, unlike the conventional approach of using MLP architectures only. We show the benefits of ICE in two experimental scenarios: a collection of twenty-four small (768×512) images (Kodak dataset), and a single large (3072×3072) image (dwarf planet Pluto), achieving better quality than previous fully implicit methods while using up to 50% fewer parameters.
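The scaling argument in the abstract can be illustrated with simple parameter arithmetic. The sketch below compares fitting one full-size MLP per image with one shared encoder feeding a small decoder MLP per image. All layer widths and the encoder size are hypothetical values chosen for illustration; they are not taken from the paper, and the encoder is modeled only as a fixed parameter budget rather than as the one-dimensional convolutional network the authors describe.

```python
def mlp_params(widths):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


def independent_total(num_images, widths):
    """One full-size MLP per image: the total grows linearly with the image count."""
    return num_images * mlp_params(widths)


def shared_total(num_images, encoder_params, decoder_widths):
    """One shared encoder plus one small decoder MLP per image."""
    return encoder_params + num_images * mlp_params(decoder_widths)


# Hypothetical sizes, for illustration only (not the paper's configuration).
NUM_IMAGES = 24                     # size of the Kodak collection
BASELINE = [2, 256, 256, 256, 3]    # assumed per-image MLP: (x, y) -> RGB
ENCODER_PARAMS = 50_000             # assumed budget for the shared encoder
DECODER = [64, 64, 64, 3]           # assumed decoder: 64-d shared feature -> RGB

baseline = independent_total(NUM_IMAGES, BASELINE)
shared = shared_total(NUM_IMAGES, ENCODER_PARAMS, DECODER)
print(f"independent MLPs: {baseline} parameters")
print(f"shared encoder + decoders: {shared} parameters")
```

Under these assumed sizes the encoder cost is paid once, so each additional image adds only the parameters of a small decoder. Whether such small decoders reach the same reconstruction quality is the empirical question the paper addresses; this arithmetic does not answer it.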