Abstract: Capturing high dynamic range (HDR) images and videos is attractive because it can reveal details in both dark and bright regions. Since mainstream screens only support low dynamic range (LDR) content, a tone mapping algorithm is required to compress the dynamic range of HDR images and videos. Although image tone mapping has been widely explored, video tone mapping lags behind, especially for deep-learning-based methods, due to the lack of HDR-LDR video pairs. In this work, we propose a unified framework (IVTMNet) for unsupervised image and video tone mapping. To improve unsupervised training, we propose a domain- and instance-based contrastive learning loss. Instead of using a universal feature extractor such as VGG to extract the features for similarity measurement, we propose a novel latent code, which is an aggregation of the brightness and contrast of extracted features, to measure the similarity of different pairs. In total, we construct two negative pairs and three positive pairs to constrain the latent codes of tone-mapped results. For the network structure, we propose a spatial-feature-enhanced (SFE) module to enable information exchange and transformation of nonlocal regions. For video tone mapping, we propose a temporal-feature-replaced (TFR) module to efficiently utilize the temporal correlation and improve the temporal consistency of video tone-mapped results. We construct a large-scale unpaired HDR-LDR video dataset to facilitate the unsupervised training process for video tone mapping. Experimental results demonstrate that our method outperforms state-of-the-art image and video tone mapping methods. Our code and dataset are available at https://github.com/cao-cong/UnCLTMO.

What problem does this paper attempt to address?

The paper addresses three main problems:

1. **Tone mapping of high-dynamic-range (HDR) images and videos**: Most display devices only support low-dynamic-range (LDR) content, so a tone mapping algorithm is needed to compress the dynamic range of HDR images or videos for display on LDR screens. Image tone mapping has been widely studied, but video tone mapping lags behind, especially for deep-learning-based methods, because HDR-LDR video pairs are scarce.
2. **Challenges in unsupervised learning**: Supervised methods rely on paired HDR-LDR data for training, and such pairs are very difficult to obtain in practice. The question is therefore how to train a model effectively without paired supervision.
3. **Temporal consistency in video tone mapping**: Unlike static images, tone-mapped video must remain temporally consistent: changes between adjacent frames should be smooth, without flickering or other temporal artifacts. Existing methods often struggle to maintain temporal and spatial consistency at the same time.

To solve these problems, the authors propose a unified framework (IVTMNet) for unsupervised image and video tone mapping. The main contributions are:

- **Network structure**: A spatial-feature-enhanced (SFE) module enhances global features through graph convolution, enabling information exchange across nonlocal regions. For video, a temporal-feature-replaced (TFR) module uses temporal correlation to improve the temporal consistency of the results.
- **Contrastive learning loss**: Domain-level and instance-level contrastive learning losses improve unsupervised training. Positive and negative sample pairs are constructed so that the generated results stay close to high-quality LDR images and far from low-quality LDR images and the input HDR images.
- **Naturalness loss**: A naturalness loss constrains the brightness and contrast of the output image so that it is closer to natural images.
- **Large-scale unpaired HDR-LDR video dataset**: A large-scale dataset containing real and synthetic HDR-LDR videos is constructed to support unsupervised training.

With these components, the paper reports that its method outperforms state-of-the-art image and video tone mapping methods.
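The latent-code idea behind the contrastive loss can be illustrated with a minimal sketch. It assumes that "brightness" and "contrast" of a feature map are its per-channel mean and standard deviation, and that the loss is a ratio of L1 distances to positive and negative samples. Neither choice is confirmed by the text above, and the function names `latent_code` and `contrastive_loss` are hypothetical. Random arrays stand in for features extracted by a network.

```python
import numpy as np

def latent_code(features):
    """Aggregate a (C, H, W) feature array into a vector of length 2*C.

    Assumption: brightness = per-channel mean, contrast = per-channel std.
    """
    brightness = features.mean(axis=(1, 2))
    contrast = features.std(axis=(1, 2))
    return np.concatenate([brightness, contrast])

def contrastive_loss(anchor, positives, negatives, eps=1e-8):
    """Assumed ratio-style loss on latent codes.

    Small when the anchor is close to the positives and far from the negatives.
    """
    d_pos = np.mean([np.abs(anchor - p).mean() for p in positives])
    d_neg = np.mean([np.abs(anchor - n).mean() for n in negatives])
    return d_pos / (d_neg + eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for extracted features (illustrative values only).
    good_ldr = rng.uniform(0.4, 0.6, size=(4, 8, 8))   # high-quality LDR
    bad_ldr = rng.uniform(0.0, 0.05, size=(4, 8, 8))   # low-quality LDR
    hdr_in = rng.uniform(0.0, 10.0, size=(4, 8, 8))    # input HDR
    output = good_ldr + 0.01                           # tone-mapped result

    loss = contrastive_loss(
        latent_code(output),
        positives=[latent_code(good_ldr)],
        negatives=[latent_code(bad_ldr), latent_code(hdr_in)],
    )
    print(f"contrastive loss: {loss:.4f}")
```

In this sketch a result whose latent code matches the high-quality LDR sample yields a lower loss than one that resembles the low-quality sample, which is the behavior the summary describes. The actual number and composition of pairs (two negative, three positive) follow the paper and are not reproduced here.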

Unsupervised HDR Image and Video Tone Mapping via Contrastive Learning

A Real-Time Semi-Supervised Deep Tone Mapping Network

Explorable Tone Mapping Operators

A Perceptually Optimized and Self-Calibrated Tone Mapping Operator

Deep tone mapping network in HSV color space

Learning-Based Tone Mapping Operator for Efficient Image Matching

Deep Video Inverse Tone Mapping Based on Temporal Clues

Adversarial and Adaptive Tone Mapping Operator for High Dynamic Range Images

Invertible Tone Mapping with Selectable Styles

Perceptually Optimized Deep High-Dynamic-Range Image Tone Mapping

Lightness Modulated Deep Inverse Tone Mapping

CIECAM16-based Tone Mapping of High Dynamic Range Images

Zero-Shot Structure-Preserving Diffusion Model for High Dynamic Range Tone Mapping

Deep Learning Tone-Mapping and Demosaicing for Automotive Vision Systems

High Dynamic Range Image Tone Mapping: Literature review and performance benchmark

Binocular Tone Mapping with Improved Overall Contrast and Local Details

Learning Differential Pyramid Representation for Tone Mapping

MLP Embedded Inverse Tone Mapping

Tone mapping algorithm based on BL-Hilbert-L2 decomposition model for HDR image

High-Dynamic-Range Tone Mapping in Intelligent Automotive Systems