WB LUTs: Contrastive Learning for White Balancing Lookup Tables

Sai Kumar Reddy Manne, Michael Wan
2024-04-16
Abstract: Automatic white balancing (AWB), one of the first steps in an integrated signal processing (ISP) pipeline, aims to correct the color cast induced by the scene illuminant. An incorrect white balance (WB) setting or AWB failure can lead to an undesired blue or red tint in the rendered sRGB image. To address this, recent methods pose the post-capture WB correction problem as an image-to-image translation task and train deep neural networks to learn the necessary color adjustments at a lower resolution. These low resolution outputs are post-processed to generate high resolution WB corrected images, forming a bottleneck in the end-to-end run time. In this paper we present a 3D Lookup Table (LUT) based WB correction model called WB LUTs that can generate high resolution outputs in real time. We introduce a contrastive learning framework with a novel hard sample mining strategy, which improves the WB correction quality of baseline 3D LUTs by 25.5%. Experimental results demonstrate that the proposed WB LUTs perform competitively against state-of-the-art models on two benchmark datasets while being 300 times faster using 12.7 times less memory. Our model and code are available at
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper attempts to address the issue of automatic white balance (AWB) in image processing. Specifically, existing AWB methods are inefficient when handling high-resolution images, especially in real-time applications. To improve this situation, the authors propose a white balance correction model based on 3D lookup tables (3D LUTs) (WB LUTs) and enhance the quality of white balance correction through a contrastive learning framework and a new hard sample mining strategy.

### Main Issues

1. **Bottlenecks of Existing Methods**:
   - Existing AWB methods typically correct white balance through image-to-image transformation tasks. These methods are efficient when processing low-resolution images but require additional post-processing steps to generate high-resolution images, leading to poor real-time performance.
   - These post-processing steps (such as edge-aware upsampling and color mapping) become bottlenecks for real-time processing.
2. **Limitations of 3D LUTs**:
   - Traditional 3D LUTs require manual tuning, lack generalization ability, and cannot adapt to different scenes and lighting conditions.
   - When using 3D LUTs directly for white balance correction, the model struggles to generate rich illumination-guided representations without direct supervision.

### Solutions

1. **3D LUTs Model**:
   - A white balance correction model based on 3D LUTs (WB LUTs) is proposed, which can directly generate high-resolution white balance corrected images, avoiding post-processing steps.
   - The quality of the model's white balance correction is improved through a contrastive learning framework and a hard sample mining strategy.
2. **Contrastive Learning Framework**:
   - A contrastive learning framework is introduced, encouraging the model to learn illumination-guided, scene-independent features by comparing anchor samples, positive samples, and negative samples.
   - A hard sample mining strategy is used to generate more challenging positive and negative samples, further enhancing the model's performance.

### Experimental Results

- Experimental results show that the proposed WB LUTs model performs comparably to the current state-of-the-art models on two benchmark datasets but with a 300-fold increase in speed and a 12.7-fold reduction in memory usage.
- Under different color temperature conditions, the WB LUTs model can maintain a low ΔE2000 value, demonstrating consistent white balance correction effects.

### Conclusion

- The authors propose a white balance correction model based on 3D LUTs (WB LUTs), significantly improving the quality and efficiency of white balance correction through a contrastive learning framework and a hard sample mining strategy.
- The model performs excellently in real-time applications, significantly reducing computational resource consumption while ensuring high quality.
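
### Illustrative Sketch: Applying a 3D LUT

To make the core idea concrete, the following is a minimal sketch of how a 3D LUT maps an input color to a corrected color using trilinear interpolation. It is not the authors' implementation: the helper names `make_lut` and `apply_lut`, the grid size of 9, and the per-channel `gains` used to fill the table are all assumptions chosen for illustration. In WB LUTs the table contents would be learned rather than derived from fixed gains.

```python
from itertools import product


def make_lut(size, transform):
    """Build a size^3 LUT by sampling `transform` on a regular RGB grid in [0, 1]."""
    step = 1.0 / (size - 1)
    return [[[transform((r * step, g * step, b * step))
              for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def apply_lut(lut, rgb):
    """Look up one RGB pixel (values in [0, 1]) with trilinear interpolation."""
    size = len(lut)
    lo, frac = [], []
    for value in rgb:
        pos = min(max(value, 0.0), 1.0) * (size - 1)
        i = min(int(pos), size - 2)  # lower grid index; keeps i + 1 in range
        lo.append(i)
        frac.append(pos - i)
    out = [0.0, 0.0, 0.0]
    # Blend the 8 corners of the grid cell that contains the pixel.
    for dr, dg, db in product((0, 1), repeat=3):
        weight = 1.0
        for d, f in zip((dr, dg, db), frac):
            weight *= f if d else 1.0 - f
        corner = lut[lo[0] + dr][lo[1] + dg][lo[2] + db]
        for c in range(3):
            out[c] += weight * corner[c]
    return tuple(out)


# Hypothetical correction: per-channel gains that reduce a bluish cast.
gains = (1.0, 0.9, 0.7)
lut = make_lut(9, lambda p: tuple(g * v for g, v in zip(gains, p)))
corrected = apply_lut(lut, (0.40, 0.50, 0.80))
```

Because each pixel is handled by a single table lookup and an 8-corner blend, the cost per pixel does not depend on image content, which is consistent with the summary's point that a LUT can be applied directly at full resolution without a separate upsampling step.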