FlyTransformer: A Cross-Modal Fusion Policy for UAV End-to-End Trajectory Planning

Wenxiang Shi, Chen Zhao, Kailei Tang, Junru Sheng, Zhiyan Dong, Lihua Zhang, Xiaoyang Kang, Kai Cao
DOI: https://doi.org/10.1109/smc53992.2023.10393990
2023-01-01
Abstract: The ability to plan trajectories efficiently is crucial for UAVs to carry out tasks autonomously. However, existing research on UAV trajectory planning often employs a cascaded pipeline that involves high-precision maps, real-time positioning, and path planning. Such methods suffer from high computational complexity and time delay, which hinder the efficiency of trajectory planning. End-to-end trajectory planning methods offer a promising solution to this problem. As the core of these end-to-end methods, the perception end plays a decisive role in trajectory planning. However, current multimodal perception fusion is limited to post-fusion: it lacks intermediate feature-level fusion and pays little attention to global visuospatial information. To solve these problems, we propose a new network architecture called FlyTransformer, which fuses the proprioceptive state and visual perception at the feature level for end-to-end trajectory planning. The architecture can also attend to key visuospatial information. We evaluate our method in forest and cuboid scenarios and their corresponding outdoor scenarios. The results show that FlyTransformer outperforms the baseline algorithms in terms of efficiency and performance.
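
To make the idea of feature-level fusion with attention more concrete, the following is a minimal Python sketch of one generic way a proprioceptive state vector could attend over visual patch features (single-head scaled dot-product cross-attention). It is not the FlyTransformer architecture, whose details the abstract does not give. The function name fuse, the projection matrices Wq, Wk, Wv, all dimensions, and the random inputs are hypothetical choices made only for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random projection matrix (stands in for learned weights)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def fuse(state, patches, Wq, Wk, Wv):
    """Hypothetical feature-level fusion: the state queries the visual patches.

    Returns the fused feature (state embedding concatenated with the
    attention-weighted visual context) and the attention weights.
    """
    q = matvec(Wq, state)                      # query from proprioceptive state
    keys = [matvec(Wk, p) for p in patches]    # one key per visual patch
    values = [matvec(Wv, p) for p in patches]  # one value per visual patch
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)                  # attention over all patches
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return q + context, weights                # list concatenation, not addition


# Assumed sizes: 6-D state, 9 visual patches of 8 features, 4-D model width.
STATE_DIM, PATCH_DIM, MODEL_DIM, NUM_PATCHES = 6, 8, 4, 9
rng = random.Random(0)
Wq = random_matrix(MODEL_DIM, STATE_DIM, rng)
Wk = random_matrix(MODEL_DIM, PATCH_DIM, rng)
Wv = random_matrix(MODEL_DIM, PATCH_DIM, rng)

state = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]
patches = [[rng.uniform(-1, 1) for _ in range(PATCH_DIM)] for _ in range(NUM_PATCHES)]

fused, weights = fuse(state, patches, Wq, Wk, Wv)
```

In this sketch the two modalities interact before any planning output is produced, and the attention weights span every patch, so the fused feature can draw on the whole image rather than a local region. A post-fusion scheme would instead process each modality separately and combine only their final outputs.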