Monocular Navigation in Large Scale Dynamic Environments

Darius Burschka
DOI: https://doi.org/10.48550/arXiv.1709.02285
2017-09-07
Abstract: We present a processing technique for a robust reconstruction of motion properties for single points in large scale, dynamic environments. We assume that the acquisition camera is moving and that there are other independently moving agents in a large environment, like road scenarios. The separation of direction and magnitude of the reconstructed motion allows for robust reconstruction of the dynamic state of the objects in situations, where conventional binocular systems fail due to a small signal (disparity) from the images due to a constant detection error, and where structure from motion approaches fail due to unobserved motion of other agents between the camera frames. We present the mathematical framework and the sensitivity analysis for the resulting system.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses navigation with a monocular camera in large-scale dynamic environments. Specifically, it focuses on robustly reconstructing, from monocular image data, the motion characteristics of multiple independently moving objects in complex scenes: their direction and speed, and their motion relative to the camera.

### Main contributions of the paper:

1. **A new processing technique**: It robustly reconstructs the motion attributes of a single point from a monocular image sequence in large-scale dynamic environments. The method targets cases where the camera is moving and other independently moving objects are present in the environment.
2. **Addressing the limitations of traditional methods**:
   - **Binocular systems**: When the signal (disparity) is small, detection errors prevent a binocular system from reconstructing motion accurately.
   - **Structure-from-motion methods**: These fail when other objects undergo unobserved motion between camera frames.
3. **Mathematical framework and sensitivity analysis**: The paper presents the mathematical framework of the proposed system and a sensitivity analysis that evaluates its performance and robustness.

### Specific technical details:

- **Motion separation**: By separating the direction and the magnitude of motion, the proposed method can still reconstruct the dynamic state of an object in situations where traditional methods fail.
- **Time-to-Collision (TTC) method**: The paper introduces the concept of time-to-collision. By analyzing the motion trajectories of individual points, it estimates the relative distances and motion relationships between these points and the camera. The method requires no additional external calibration data and computes its results directly from image information.
- **Special case handling**:
  - **Planar motion**: For motion confined to a horizontal plane, such as an office floor or a road, the vanishing point (epipole) can be found through simple geometric relationships, from which the motion parameters are estimated.
  - **Direct-collision candidate points**: An observed point that lies exactly at the vanishing point of the current motion can be regarded as a static target for defining potential collision objects.
  - **Multi-point motion**: For multiple points on a rigid body, the motion parameters of the entire object can be estimated from the motion directions and positions of those points.

### Experimental results:

- **Experimental verification**: The framework was implemented on Linux and uses AKAZE features from OpenCV to estimate sparse optical flow. The experiments show that the method still provides useful results at long distances, where traditional binocular stereo vision fails because of depth-estimation errors.
- **Simulated scenes**: The method was also verified in simulated scenes. In dynamic scenes in particular, it computes collision relationships accurately under different speed changes, which supports collision-avoidance planning.

### Conclusion:

The paper proposes a new method for navigating with a monocular camera in large-scale dynamic environments. Using the concept of time-to-collision, it robustly reconstructs the motion characteristics of objects. The method requires no additional external calibration data and outperforms traditional binocular stereo vision at long distances, which supports autonomous driving and collision-avoidance applications.
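
To make the time-to-collision idea above concrete, here is a minimal sketch of the classical time-to-contact relation for a single tracked image point. It is an illustration under stated assumptions, not the paper's own formulation: the relative motion is taken to be a pure translation at constant velocity, the focus of expansion (the epipole) is assumed to be already known, and the function name `time_to_collision` and its parameters are hypothetical.

```python
import math


def time_to_collision(p_prev, p_curr, foe, dt):
    """Estimate time-to-collision (seconds) from two observations of one point.

    p_prev, p_curr: (x, y) image positions in the previous and current frame.
    foe: (x, y) focus of expansion (epipole), assumed known here.
    dt: time between the two frames in seconds.

    Under a pinhole model with constant relative velocity, the image distance r
    of a point from the focus of expansion scales as 1/Z, so the expression
    r_prev * dt / (r_curr - r_prev) equals the remaining time until the point
    reaches the camera plane, measured from the current frame.
    No metric depth or focal length is needed.
    """
    r_prev = math.hypot(p_prev[0] - foe[0], p_prev[1] - foe[1])
    r_curr = math.hypot(p_curr[0] - foe[0], p_curr[1] - foe[1])
    expansion = r_curr - r_prev
    if expansion <= 0.0:
        # The point is not moving away from the focus of expansion,
        # so it is not approaching (or the motion is below resolution).
        return math.inf
    return r_prev * dt / expansion


def project(point_3d, focal_length):
    """Pinhole projection used only to generate synthetic test observations."""
    x, y, z = point_3d
    return (focal_length * x / z, focal_length * y / z)
```

For example, a point 1 m off the optical axis that approaches from 20 m to 19 m within 0.1 s yields a time-to-collision of 1.9 s from the current frame, whatever focal length is used to project it. This is why the quantity can be computed from image information alone.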