ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement

Eashan Adhikarla, Kai Zhang, John Nicholson, Brian D. Davison
2024-08-19
Abstract: Low-light image enhancement remains a challenging task in computer vision, with existing state-of-the-art models often limited by hardware constraints and computational inefficiencies, particularly in handling high-resolution images. Recent foundation models, such as transformers and diffusion models, despite their efficacy in various domains, are limited in use on edge devices due to their computational complexity and slow inference times. We introduce ExpoMamba, a novel architecture that integrates components of the frequency state space within a modified U-Net, offering a blend of efficiency and effectiveness. This model is specifically optimized to address mixed exposure challenges, a common issue in low-light image enhancement, while ensuring computational efficiency. Our experiments demonstrate that ExpoMamba enhances low-light images up to 2-3x faster than traditional models with an inference time of 36.6 ms and achieves a PSNR improvement of approximately 15-20% over competing models, making it highly suitable for real-time image processing applications.
Computer Vision and Pattern Recognition, Artificial Intelligence, Multimedia, Image and Video Processing
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses Low-Light Image Enhancement (LLIE), and in particular the hardware and computational-efficiency limits that arise when processing high-resolution images. Specifically:

1. **Limitations of Existing Methods**:
   - Current state-of-the-art models (such as Transformers and diffusion models) perform well in multiple domains, but their computational complexity and long inference times limit their use on edge devices.
   - Traditional low-light image enhancement techniques struggle to balance processing speed and output quality on high-resolution images, leading to issues such as noise and color distortion.
2. **Proposed Solution**:
   - The paper introduces a new architecture, ExpoMamba, which combines a Frequency State Space Block (FSSB) with a modified U-Net to achieve efficient and effective low-light image enhancement.
   - It specifically targets the mixed exposure challenge (the presence of both underexposed and overexposed areas within the same image frame), improving image quality while maintaining computational efficiency.
   - Experiments show that ExpoMamba runs inference 2-3 times faster than traditional models, with an inference time of 36.6 milliseconds, and improves Peak Signal-to-Noise Ratio (PSNR) by approximately 15-20% over competing models, making it well suited to real-time image processing.

With these improvements, ExpoMamba aims to provide a faster and more efficient solution for low-light image enhancement, particularly for applications such as mobile photography and real-time video streaming.
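
To make the two headline figures concrete (the 36.6 ms inference time and the PSNR gain), here is a minimal sketch of how such numbers are commonly computed. It is not taken from the paper's code: the function names `psnr` and `frames_per_second`, the 8-bit peak value of 255, and the toy pixel values are all assumptions chosen for illustration. It uses the standard definition PSNR = 10 · log10(MAX² / MSE).

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB for two equally sized flat pixel sequences.

    Assumes 8-bit intensities by default (max_value=255); this is an
    illustrative helper, not the evaluation code used in the paper.
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def frames_per_second(latency_ms):
    """Convert a per-frame latency in milliseconds to throughput in frames/s."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Toy data (hypothetical): a ground-truth patch and an enhanced estimate.
    ground_truth = [52, 55, 61, 59, 79, 61, 76, 61]
    estimate = [54, 55, 60, 57, 80, 63, 75, 60]
    print(f"PSNR: {psnr(ground_truth, estimate):.2f} dB")
    print(f"Throughput at 36.6 ms/frame: {frames_per_second(36.6):.1f} fps")
```

Under these assumptions, a 36.6 ms latency corresponds to roughly 27 frames per second when frames are processed one at a time. The summary does not say how the 15-20% PSNR improvement is measured; if it is relative to a baseline's dB value, a 20 dB baseline would correspond to about 23-24 dB.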