Handheld Multi-Frame Super-Resolution

Bartlomiej Wronski, Ignacio Garcia-Dorado, Manfred Ernst, Damien Kelly, Michael Krainin, Chia-Kai Liang, Marc Levoy, Peyman Milanfar
DOI: https://doi.org/10.1145/3306346.3323024
2021-02-17
Abstract: Compared to DSLR cameras, smartphone cameras have smaller sensors, which limits their spatial resolution; smaller apertures, which limits their light gathering ability; and smaller pixels, which reduces their signal-to-noise ratio. The use of color filter arrays (CFAs) requires demosaicing, which further degrades resolution. In this paper, we supplant the use of traditional demosaicing in single-frame and burst photography pipelines with a multi-frame super-resolution algorithm that creates a complete RGB image directly from a burst of CFA raw images. We harness natural hand tremor, typical in handheld photography, to acquire a burst of raw frames with small offsets. These frames are then aligned and merged to form a single image with red, green, and blue values at every pixel site. This approach, which includes no explicit demosaicing step, serves to both increase image resolution and boost signal-to-noise ratio. Our algorithm is robust to challenging scene conditions: local motion, occlusion, or scene changes. It runs at 100 milliseconds per 12-megapixel RAW input burst frame on mass-produced mobile phones. Specifically, the algorithm is the basis of the Super-Res Zoom feature, as well as the default merge method in Night Sight mode (whether zooming or not) on Google's flagship phone.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses limitations imposed by smartphone camera hardware, chiefly limited spatial resolution. It proposes a handheld multi-frame super-resolution algorithm that merges a burst of RAW frames directly into a complete RGB image, replacing the demosaicing step in traditional single-frame and burst photography pipelines. The approach both increases image resolution and improves the signal-to-noise ratio.

### Main Problems

1. **Limited Spatial Resolution**: Smartphone cameras have small sensors, small apertures, and small pixels, which limit spatial resolution, light gathering, and signal-to-noise ratio respectively. The use of color filter arrays (such as Bayer arrays) requires demosaicing, which further reduces resolution.
2. **Limitations of Demosaicing**: Traditional demosaicing algorithms assume that color is relatively constant within a local image region. Where this assumption does not hold, resolution drops and fine detail is lost.
3. **Noise in Low-light Conditions**: Images captured in low light are noisy and need an effective noise reduction method.

### Solutions

The multi-frame super-resolution algorithm proposed in the paper addresses these problems as follows:

- **Multi-frame Image Merging**: Natural hand tremor during handheld shooting yields a burst of RAW frames with small offsets. These frames are aligned and merged into a single high-resolution RGB image.
- **No Explicit Demosaicing Required**: The algorithm reconstructs a complete RGB image directly from the RAW burst, avoiding the resolution loss caused by a traditional demosaicing step.
- **High Adaptability**: The algorithm handles local motion, occlusion, and scene changes, and remains robust in challenging scenes.
- **Low Latency**: The algorithm runs on mass-produced mobile phones and takes about 100 milliseconds per 12-megapixel RAW burst frame.

### Main Contributions

1. **Multi-frame Super-resolution Algorithm Replacing Demosaicing**: A multi-frame super-resolution algorithm that generates high-resolution RGB images directly from RAW images.
2. **Adaptive Kernel Interpolation / Merging Method**: An adaptive kernel interpolation method, driven by local image structure, that improves reconstruction accuracy.
3. **Motion Robustness Model**: A motion robustness model that keeps the algorithm reliable in scenes containing local motion, occlusion, and alignment failures.
4. **Analysis of Natural Hand Tremor**: An analysis of natural hand tremor as a source of sub-pixel coverage, showing that it is sufficient to support multi-frame super-resolution.

Together, these contributions offer a practical way to improve the image quality and resolution of smartphone cameras, particularly for handheld shooting and low-light conditions.
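The merge idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an RGGB Bayer layout, assumes the per-frame sub-pixel offsets are already known from alignment, uses a fixed isotropic Gaussian kernel in place of the paper's structure-adaptive kernels, omits the motion robustness model, and writes to a grid at the input resolution. The names `merge_burst`, `mosaic_constant`, `sigma`, and `radius` are hypothetical.

```python
import math

# Assumed RGGB Bayer layout: channel (0=R, 1=G, 2=B) by (row parity, column parity).
BAYER_RGGB = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}


def mosaic_constant(h, w, rgb):
    """Build a synthetic Bayer frame of a uniformly colored scene (test helper)."""
    return [[rgb[BAYER_RGGB[(y % 2, x % 2)]] for x in range(w)] for y in range(h)]


def merge_burst(frames, offsets, sigma=0.7, radius=2):
    """Merge Bayer frames into an RGB image by kernel-weighted accumulation.

    frames  : list of 2-D lists of raw values, all the same size
    offsets : per-frame (dy, dx) shift into reference coordinates (assumed known)
    Returns a height x width x 3 nested list.
    """
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[[0.0] * 3 for _ in range(w)] for _ in range(h)]   # weighted sums
    wsum = [[[0.0] * 3 for _ in range(w)] for _ in range(h)]  # sums of weights

    for frame, (dy, dx) in zip(frames, offsets):
        for y in range(h):
            for x in range(w):
                c = BAYER_RGGB[(y % 2, x % 2)]   # color this raw sample measures
                sy, sx = y + dy, x + dx          # sample position in reference frame
                cy, cx = round(sy), round(sx)
                # Splat the sample onto nearby output pixels of the same channel.
                for oy in range(cy - radius, cy + radius + 1):
                    for ox in range(cx - radius, cx + radius + 1):
                        if 0 <= oy < h and 0 <= ox < w:
                            d2 = (oy - sy) ** 2 + (ox - sx) ** 2
                            wgt = math.exp(-d2 / (2.0 * sigma * sigma))
                            acc[oy][ox][c] += wgt * frame[y][x]
                            wsum[oy][ox][c] += wgt

    # Normalize each channel independently; no demosaicing step is involved.
    return [[[acc[y][x][c] / wsum[y][x][c] if wsum[y][x][c] > 0 else 0.0
              for c in range(3)] for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    scene = (0.2, 0.5, 0.8)
    burst = [mosaic_constant(6, 6, scene) for _ in range(3)]
    shifts = [(0.0, 0.0), (0.3, -0.4), (-0.25, 0.45)]  # made-up hand-tremor offsets
    rgb = merge_burst(burst, shifts)
    print(rgb[3][3])
```

Each output pixel receives red, green, and blue estimates because samples of all three colors fall within the kernel radius, and additional frames with different offsets fill in positions a single Bayer frame never samples. Averaging several samples per pixel is also what reduces noise in this scheme.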