Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network for Motion Deblurring

Dan Yang, Mehmet Yamac

2023-06-01

Abstract: Event cameras differ from conventional RGB cameras in that they produce asynchronous data sequences. While RGB cameras capture every frame at a fixed rate, event cameras only capture changes in the scene, resulting in sparse and asynchronous data output. Although event data carries information that is useful for motion deblurring of RGB images, integrating event and image information remains a challenge. Recent state-of-the-art CNN-based deblurring solutions produce multiple 2-D event frames by accumulating event data over a time period. In most of these techniques, however, the number of event frames is fixed and predefined, which reduces temporal resolution drastically, particularly when fast-moving objects are present or when longer exposure times are required. Modern cameras (e.g., cameras in mobile phones) also set the exposure time of the image dynamically, which presents an additional problem for networks developed for a fixed number of event frames. To address these challenges, we developed a Long Short-Term Memory (LSTM)-based event feature extraction module, which enables us to use a dynamically varying number of event frames. Using these modules, we constructed a state-of-the-art deblurring network, Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network (DLEFNet). It is particularly useful for scenarios in which exposure times vary depending on factors such as lighting conditions or the presence of fast-moving objects in the scene. Evaluation results demonstrate that the proposed method can outperform existing state-of-the-art networks for the deblurring task on synthetic and real-world data sets.
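The accumulation of events into 2-D event frames mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event tuple layout `(t_us, x, y, polarity)`, the function name `accumulate_event_frames`, and the fixed bin width are assumptions made for illustration. The sketch keeps the temporal bin width constant, so the number of frames grows with the exposure time instead of being predefined.

```python
from typing import List, Sequence, Tuple

# Hypothetical event layout: (timestamp in microseconds, x, y, polarity in {-1, +1}).
Event = Tuple[int, int, int, int]


def accumulate_event_frames(
    events: Sequence[Event],
    t_start_us: int,
    t_end_us: int,
    bin_us: int,
    height: int,
    width: int,
) -> List[List[List[int]]]:
    """Sum event polarities per pixel into frames of fixed temporal width.

    The number of returned frames depends on the exposure window
    [t_start_us, t_end_us), not on a predefined constant.
    """
    if bin_us <= 0 or t_end_us <= t_start_us:
        raise ValueError("bin_us must be positive and the window non-empty")

    duration = t_end_us - t_start_us
    num_frames = -(-duration // bin_us)  # ceiling division on integers
    frames = [[[0] * width for _ in range(height)] for _ in range(num_frames)]

    for t_us, x, y, polarity in events:
        # Ignore events outside the exposure window or outside the sensor.
        if not (t_start_us <= t_us < t_end_us):
            continue
        if not (0 <= x < width and 0 <= y < height):
            continue
        index = (t_us - t_start_us) // bin_us
        frames[index][y][x] += polarity
    return frames


if __name__ == "__main__":
    demo = [(100, 0, 0, 1), (1500, 1, 0, -1), (1600, 1, 0, -1), (3900, 0, 1, 1)]
    short = accumulate_event_frames(demo, 0, 2000, 1000, height=2, width=2)
    long = accumulate_event_frames(demo, 0, 4000, 1000, height=2, width=2)
    print(len(short), len(long))
```

With a 1000 µs bin, a 2000 µs exposure yields two frames and a 4000 µs exposure yields four, which is the varying-length input that a network built for a fixed number of event frames cannot consume directly.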

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses image deblurring in dynamic scenes, using data from event cameras to improve deblurring performance. It targets three challenges:

1. **Fusion of Event Data and RGB Image Data**: Event cameras record only changes in the scene, so their output is sparse and asynchronous. Although event data is very useful for motion deblurring of RGB images, effectively combining it with image information remains a challenge.
2. **Fixed Number of Event Frames Limitation**: Existing convolutional neural network (CNN)-based deblurring solutions typically assume a fixed exposure time and create multiple 2D event frames based on this assumption. Because the number of event frames is usually fixed, temporal resolution drops significantly, especially for fast-moving objects or longer exposure times.
3. **Dynamic Exposure Time of Modern Cameras**: Modern cameras (such as those in smartphones) set exposure times dynamically, posing an additional challenge to networks that assume a fixed number of event frames.

To address these issues, the authors propose a new network architecture, the Deformable Convolutions and LSTM-based Flexible Event Frame Fusion Network (DLEFNet). The network combines a Long Short-Term Memory (LSTM)-based event feature extraction module, which allows it to handle a dynamically varying number of event frames, with deformable convolutions. DLEFNet also incorporates encoded features of RGB frames at multiple scales. The network is particularly suited to scenarios involving fast-moving objects and varying exposure times. Experimental results show that DLEFNet outperforms existing state-of-the-art networks on both the synthetic GoPro dataset and the real-world REBlur dataset.
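Why a recurrent module can accept a varying number of event frames can be seen in a toy sketch. This is not the DLEFNet module: it uses the standard LSTM cell equations on small feature vectors with deterministic placeholder weights, whereas the paper's module works on event frame features inside a deblurring network. The names `TinyLSTMCell` and `encode_event_frames` are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


class TinyLSTMCell:
    """Standard LSTM cell on plain Python lists (illustrative only)."""

    def __init__(self, input_size: int, hidden_size: int, scale: float = 0.1):
        self.input_size = input_size
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # Placeholder weights (assumption): deterministic, not learned.
        self.w = [
            [scale * math.sin(1 + r * n + c) for c in range(n)]
            for r in range(4 * hidden_size)
        ]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x: Sequence[float], h: List[float], c: List[float]):
        z = list(x) + list(h)
        pre = [
            sum(wi * zi for wi, zi in zip(row, z)) + bias
            for row, bias in zip(self.w, self.b)
        ]
        k = self.hidden_size
        i_gate = [sigmoid(v) for v in pre[0:k]]          # input gate
        f_gate = [sigmoid(v) for v in pre[k:2 * k]]      # forget gate
        g_cand = [math.tanh(v) for v in pre[2 * k:3 * k]]  # candidate state
        o_gate = [sigmoid(v) for v in pre[3 * k:4 * k]]  # output gate
        c_new = [f * cp + i * g for f, cp, i, g in zip(f_gate, c, i_gate, g_cand)]
        h_new = [o * math.tanh(cn) for o, cn in zip(o_gate, c_new)]
        return h_new, c_new


def encode_event_frames(
    cell: TinyLSTMCell, frame_features: Sequence[Sequence[float]]
) -> List[float]:
    """Fold any number of per-frame feature vectors into one fixed-size state."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for features in frame_features:
        if len(features) != cell.input_size:
            raise ValueError("feature vector has the wrong length")
        h, c = cell.step(features, h, c)
    return h


if __name__ == "__main__":
    cell = TinyLSTMCell(input_size=3, hidden_size=4)
    two_frames = encode_event_frames(cell, [[1.0, 0.0, -1.0]] * 2)
    seven_frames = encode_event_frames(cell, [[1.0, 0.0, -1.0]] * 7)
    print(len(two_frames), len(seven_frames))
```

Both calls return a state of length 4, whether two or seven frames are supplied. The output size is set by the hidden state rather than by the sequence length, so the downstream layers see the same shape for any exposure time.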

MEFNet: Multi-scale Event Fusion Network for Motion Deblurring

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

A Residual Learning Approach to Deblur and Generate High Frame Rate Video with an Event Camera

Beyond Camera Motion Removing: How to Handle Outliers in Deblurring

Learning for Motion Deblurring with Hybrid Frames and Events

Learning to Deblur and Generate High Frame Rate Video with an Event Camera

Image Deblurring Utilizing Inertial Sensors and a Short-Long-Short Exposure Strategy.

Unifying Motion Deblurring and Frame Interpolation with Events

Learning Event-Based Motion Deblurring

Motion Deblurring via Spatial-Temporal Collaboration of Frames and Events

Deblurring Low-Light Images with Events

Learning an Occlusion-Aware Network for Video Deblurring

Event-based Image Deblurring with Dynamic Motion Awareness

DeLiEve-Net: Deblurring Low-light Images with Light Streaks and Local Events

Motion Aware Event Representation-Driven Image Deblurring

Towards Real-world Event-guided Low-light Video Enhancement and Deblurring

Efficient Multi-scale Network with Learnable Discrete Wavelet Transform for Blind Motion Deblurring

Online Video Deblurring via Dynamic Temporal Blending Network