DeepKalPose: An Enhanced Deep-Learning Kalman Filter for Temporally Consistent Monocular Vehicle Pose Estimation

Leandro Di Bella,Yangxintong Lyu,Adrian Munteanu

DOI: https://doi.org/10.1049/ell2.13191

2024-04-25

Abstract:This paper presents DeepKalPose, a novel approach for enhancing temporal consistency in monocular vehicle pose estimation applied on video through a deep-learning-based Kalman Filter. By integrating a Bi-directional Kalman filter strategy utilizing forward and backward time-series processing, combined with a learnable motion model to represent complex motion patterns, our method significantly improves pose accuracy and robustness across various conditions, particularly for occluded or distant vehicles. Experimental validation on the KITTI dataset confirms that DeepKalPose outperforms existing methods in both pose accuracy and temporal consistency.

Computer Vision and Pattern Recognition,Artificial Intelligence,Robotics

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the temporal consistency problem of vehicle 6D pose estimation in monocular videos. Specifically, traditional image - based methods, when processing video data, process each frame independently, resulting in temporal inconsistency. Especially when dealing with occluded or distant vehicles, the predicted poses will have large fluctuations and instability (such as flickering artifacts). These problems are mainly due to the lack of constraints on inter - frame continuity and consistency in traditional methods. To solve these problems, the authors propose a new method named DeepKalPose. By combining deep learning and the Kalman Filter, it aims to improve the temporal consistency of vehicle pose estimation, thereby enhancing the accuracy and robustness of pose estimation. The following are the main contributions of this method: 1. **Bidirectional Kalman filtering strategy**: DeepKalPose adopts a bidirectional Kalman filtering strategy, using forward and backward time - series processing, enabling the image - based pose estimator to better handle video data. 2. **Learnable motion model**: A learnable motion model is introduced and integrated into the Kalman Filter, allowing the network to learn more complex and nonlinear vehicle motion patterns. 3. **Experimental verification**: The experimental results on the KITTI dataset show that DeepKalPose outperforms existing image - based techniques in both pose accuracy and temporal consistency, especially when dealing with occluded and distant vehicles. Through these improvements, DeepKalPose effectively solves the temporal inconsistency problem existing in traditional methods and improves the stability and accuracy of vehicle pose estimation.

DeepKalPose: An Enhanced Deep-Learning Kalman Filter for Temporally Consistent Monocular Vehicle Pose Estimation

DeepKalPose: An enhanced deep‐learning Kalman filter for temporally consistent monocular vehicle pose estimation

BB-Align: A Lightweight Pose Recovery Framework for Vehicle-to-Vehicle Cooperative Perception

Stereo Visual-Inertial Odometry with Multiple Kalman Filters Ensemble

Temporal Consistent Object Pose Estimation from Monocular Videos

ACK-MSCKF: Tightly-Coupled Ackermann Multi-State Constraint Kalman Filter for Autonomous Vehicle Localization

FlexKalmanNet: A Modular AI-Enhanced Kalman Filter Framework Applied to Spacecraft Motion Estimation

Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation

DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry

Depth-aware Imbalance Learning for Monocular 6dof Vehicle Pose Estimation

DeepVO: A Deep Learning approach for Monocular Visual Odometry

MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion

MuDeepNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose Using Multi-view Consistency Loss

Visual Odometry with Deep Bidirectional Recurrent Neural Networks.

Learning Visual-Inertial Odometry With Robocentric Iterated Extended Kalman Filter

Multi-vehicle Detection and Tracking Based on Kalman Filter and Data Association

TLIO: Tight Learned Inertial Odometry

PO-MSCKF: An Efficient Visual-Inertial Odometry by Reconstructing the Multi-State Constrained Kalman Filter with the Pose-only Theory

Kalman filter-based deep fused architecture for knee angle estimation

Deep Dual Consecutive Network for Human Pose Estimation

Twinvo: Unsupervised Learning of Monocular Visual Odometry Using Bi-Direction Twin Network