Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das
2024-05-09
Abstract: The intrinsic capability to perceive depth of field and extract salient information by the Human Vision System (HVS) stimulates a pilot to perform manual landing over an autoland approach. However, harsh weather creates visibility hindrances, and a pilot must have a clear view of runway elements before the minimum decision altitude. To help a pilot in manual landing, a vision-based system tailored to localize runway elements likewise gets affected, especially during crosswind due to the projective distortion of aircraft camera images. To combat this, we propose to integrate a prompt-based climatic diffusion network with a weather distillation model using a novel diffusion-distillation loss. Precisely, the diffusion model synthesizes climatic-conditioned landing images, and the weather distillation model learns inverse mapping by clearing those visual degradations. Then, to tackle the crosswind landing scenario, a novel Regularized Spatial Transformer Networks (RuSTaN) learns to accurately calibrate for projective distortion using self-supervised learning, which minimizes localization error by the downstream runway object detector. Finally, we have simulated a clear-day landing scenario at the busiest airport globally to curate an image-based Aircraft Landing Dataset (AIRLAD) and experimentally validated our contributions using this dataset to benchmark the performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses degraded visibility during manual aircraft landing in adverse weather and aims to improve the accuracy of runway element localization. Specifically, it tackles two main challenges:

1. **Visual Degradation in Adverse Weather**: In harsh weather (such as fog, rain, or snow), the pilot's visibility is severely reduced, making it difficult to see runway elements before reaching the minimum decision altitude. To assist pilots in manual landing, the paper combines a prompt-based climatic diffusion network with a weather distillation model, trained with a novel diffusion-distillation loss. The diffusion model synthesizes landing images under a given weather condition, and the distillation model learns the inverse mapping that clears those degradations.

2. **Projective Distortion During Crosswind Landing**: Because of the crosswind and the resulting flight control inputs, the vertical axis of the aircraft camera image may no longer be parallel to the runway, which distorts the image. To correct this, the paper introduces Regularized Spatial Transformer Networks (RuSTaN). By optimizing the localization network of existing spatial transformer networks through self-supervised learning, RuSTaN predicts affine parameters in real time that make the vertical axis of the distorted image parallel to the runway, thereby reducing the localization error of the downstream runway object detector.

Additionally, the paper builds the Aircraft Landing Dataset (AIRLAD) from simulated clear-day landing images at a busy airport (Hartsfield-Jackson Atlanta International Airport) and uses it to evaluate the proposed vision-based landing system. Through these methods, the paper aims to make manual aircraft landing in adverse weather safer and more reliable.
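
To make the affine correction for the crosswind case more concrete, the following is a minimal geometric sketch. It is not the paper's RuSTaN model and involves no learning. It only shows what a 2x3 affine parameter matrix does when it re-aligns a tilted runway centerline with the image's vertical axis. The function names (`deroll_theta`, `apply_affine`, `angle_from_vertical_deg`) and the 12 degree tilt are hypothetical, chosen for illustration. The sketch also assumes the distortion is a pure in-plane rotation about the image center, which is a simplification of the projective distortion the paper targets.

```python
import math

def deroll_theta(tilt_deg):
    """Build a 2x3 affine matrix (illustrative) that rotates normalized image
    coordinates by tilt_deg about the origin, taken here as the image center."""
    a = math.radians(tilt_deg)
    c, s = math.cos(a), math.sin(a)
    # Six affine parameters: a 2x2 rotation part plus a zero translation column.
    return [[c, -s, 0.0],
            [s,  c, 0.0]]

def apply_affine(theta, point):
    """Apply the 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (theta[0][0] * x + theta[0][1] * y + theta[0][2],
            theta[1][0] * x + theta[1][1] * y + theta[1][2])

def angle_from_vertical_deg(p0, p1):
    """Signed angle, in degrees, between the segment p0 -> p1 and the vertical axis."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    return math.degrees(math.atan2(dx, dy))

# Assumed scenario: the runway centerline appears tilted 12 degrees from vertical.
tilt = 12.0
near = (0.0, 0.0)
far = (math.sin(math.radians(tilt)), math.cos(math.radians(tilt)))

theta = deroll_theta(tilt)
near_fixed = apply_affine(theta, near)
far_fixed = apply_affine(theta, far)

print("tilt before:", angle_from_vertical_deg(near, far))
print("tilt after: ", angle_from_vertical_deg(near_fixed, far_fixed))
```

In a spatial transformer network, the affine matrix is usually applied the other way around: it maps output grid coordinates to input sampling locations, and a localization network predicts it from the image instead of computing it from a known angle. The sketch applies the matrix directly to points only to keep the geometry easy to follow.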