DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model

Erez Yosef, Raja Giryes
2024-08-14
Abstract: The flat lensless camera design reduces the camera size and weight significantly. In this design, the camera lens is replaced by another optical element that interferes with the incoming light. The image is recovered from the raw sensor measurements using a reconstruction algorithm. Yet, the quality of the reconstructed images is not satisfactory. To mitigate this, we propose utilizing a pre-trained diffusion model with a control network and a learned separable transformation for reconstruction. This allows us to build a prototype flat camera with high-quality imaging, presenting state-of-the-art results in terms of both quality and perceptuality. We also demonstrate its ability to leverage textual descriptions of the captured scene to further enhance reconstruction. Our reconstruction method, which leverages the strong capabilities of a pre-trained diffusion model, can be used in other imaging systems for improved reconstruction results.
Computer Vision and Pattern Recognition, Artificial Intelligence, Image and Video Processing
What problem does this paper attempt to address?
The problem this paper addresses is: **how to reconstruct high-quality images from the raw measurements of flat lensless cameras**. A flat lensless camera replaces the conventional lens with a mask or another optical element, which significantly reduces the size and weight of the camera. However, images reconstructed from such a design have so far been of unsatisfactory quality, making it difficult to obtain clear and accurate results. To address this, the authors propose a method named **DifuzCam**, which combines a pre-trained diffusion model, a control network, and a learned separable transformation for image reconstruction. The method improves reconstruction quality and can further enhance the result through text guidance.

### Main contributions

1. **A new computational photography method based on diffusion models** for reconstructing high-quality images from the measurements of flat lensless cameras.
2. **State-of-the-art reconstruction quality** on all evaluation metrics.
3. **Text guidance**: a textual description of the captured scene is used to improve the reconstruction.
4. **A deep control network with intermediate separable losses** to improve convergence and reconstruction results.

### Method overview

The workflow of DifuzCam is as follows:

- **Input data**: The raw sensor measurements captured by the flat lensless camera.
- **Separable transformation**: A learned transformation converts the measurements into a form suitable for processing by the diffusion model.
- **Control network**: The pre-trained diffusion model generates the image, and the control network steers the generation process using the transformed measurements.
- **Text guidance (optional)**: A text description of the captured scene is provided to further improve the reconstruction.
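The separable transformation step can be illustrated with a small sketch. It assumes the form commonly used for separable lensless models, in which one matrix acts on the rows of the measurement and a second matrix acts on its columns, so that the output is `w_left @ y @ w_right^T`. The summary does not state the exact form the authors use, and the function names, matrix sizes, and weights below are hypothetical toy values rather than learned parameters.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def separable_transform(y, w_left, w_right):
    """Apply an assumed separable transform to one measurement channel.

    y       : sensor measurement, shape (H_s, W_s)
    w_left  : row-mixing matrix, shape (H_o, H_s)
    w_right : column-mixing matrix, shape (W_o, W_s)
    Returns w_left @ y @ w_right^T, shape (H_o, W_o).
    """
    return matmul(matmul(w_left, y), transpose(w_right))


# Toy example: a 2x3 "measurement" mapped to a 3x2 output.
y = [[1, 2, 3],
     [4, 5, 6]]
w_left = [[1, 0],
          [0, 1],
          [1, 1]]      # hypothetical weights, 3x2
w_right = [[1, 0, 0],
           [0, 0, 1]]  # hypothetical weights, 2x3

x = separable_transform(y, w_left, w_right)
print(x)  # [[1, 3], [4, 6], [5, 9]]
```

Under this assumed form, the transform needs only `H_o*H_s + W_o*W_s` weights, whereas a dense linear map from every sensor pixel to every output pixel would need `H_o*W_o*H_s*W_s`. In a real system the two matrices would be trained together with the rest of the network in a tensor library.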
Together, these contributions improve the reconstruction quality of flat lensless cameras, and the authors note that the same reconstruction approach can be applied to other imaging systems.