1st Solution Places for CVPR 2023 UG$^{\textbf{2}}$+ Challenge Track 2.1-Text Recognition through Atmospheric Turbulence

Shengqi Xu, Xueyao Xiao, Shuning Cao, Yi Chang, Luxin Yan
2023-06-15
Abstract: In this technical report, we present the solution developed by our team VIELab-HUST for text recognition through atmospheric turbulence in Track 2.1 of the CVPR 2023 UG$^{2}$+ challenge. Our solution is an efficient multi-stage framework that restores a high-quality image from a sequence of distorted frames. Specifically, a sharpness-based frame selection algorithm first selects the sharpest subset of the distorted frames. Next, the selected frames are aligned through optical-flow-based image registration to suppress geometric distortion. Then, a region-based image fusion method using the DT-CWT mitigates the blur caused by the turbulence. Finally, a learning-based artifact-removal method removes the artifacts in the fused image, generating a high-quality output. Our framework handles both the hot-air text dataset and the turbulence text dataset provided in the final testing phase, and it achieved 1st place in text recognition accuracy. Our code will be available at https://github.com/xsqhust/Turbulence_Removal.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily addresses the problem of text recognition under the influence of atmospheric turbulence. Specifically, the research team proposed a solution for the "Text Recognition through Atmospheric Turbulence" task in Track 2.1 of the CVPR 2023 UG$^{2}$+ Challenge. They designed an efficient multi-stage framework to recover high-quality images from turbulence-distorted image sequences, enabling subsequent text recognition systems to successfully identify the text in the restored images. The framework includes the following four main steps:

1. **Frame Selection**: A sharpness-based frame selection algorithm is used to select the clearest set of frames.
2. **Image Registration**: Optical flow estimation is performed on the selected frames to align them, thereby suppressing geometric distortions.
3. **Image Fusion**: A region-based image fusion method with Dual-Tree Complex Wavelet Transform (DT-CWT) is employed to mitigate turbulence-induced blur.
4. **Artifact Removal**: A learning-based method is applied to remove artifacts from the fused image, further enhancing image quality.

This framework effectively handles two types of text datasets: the hot-air text dataset and the turbulence text dataset, achieving first place in text recognition accuracy during the final testing phase.
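
To make the Frame Selection step concrete, here is a minimal sketch of sharpness-based frame ranking. The text above does not say which sharpness measure the authors use, so the variance of a 4-neighbour Laplacian is an assumption chosen for illustration, and the names `laplacian_variance` and `select_sharpest` are hypothetical. Frames are plain nested lists of grayscale intensities to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
from statistics import pvariance
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (illustrative representation).
Frame = Sequence[Sequence[float]]


def laplacian_variance(frame: Frame) -> float:
    """Assumed sharpness score: variance of the 4-neighbour Laplacian.

    Only interior pixels are used, so the frame must be at least 3x3.
    """
    height, width = len(frame), len(frame[0])
    responses = [
        frame[y - 1][x] + frame[y + 1][x] + frame[y][x - 1] + frame[y][x + 1]
        - 4 * frame[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def select_sharpest(frames: Sequence[Frame], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, sharpest first."""
    scores = [laplacian_variance(f) for f in frames]
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy sequence: a flat frame, a high-contrast checkerboard, a low-contrast one.
flat = [[128] * 5 for _ in range(5)]
sharp = [[255 if (x + y) % 2 == 0 else 0 for x in range(5)] for y in range(5)]
soft = [[140 if (x + y) % 2 == 0 else 116 for x in range(5)] for y in range(5)]

print(select_sharpest([flat, sharp, soft], k=2))  # [1, 2]
```

The selected indices would then feed the registration and fusion stages. How many frames to keep (`k` here) is a tuning choice that this summary does not specify.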