Labits: Layered Bidirectional Time Surfaces Representation for Event Camera-based Continuous Dense Trajectory Estimation

Zhongyang Zhang, Jiacheng Qiu, Shuyang Cui, Yijun Luo, Tauhidur Rahman
2024-12-12
Abstract: Event cameras provide a compelling alternative to traditional frame-based sensors, capturing dynamic scenes with high temporal resolution and low latency. Moving objects trigger events with precise timestamps along their trajectory, enabling smooth continuous-time estimation. However, few works have attempted to optimize the information loss during event representation construction, imposing a ceiling on this task. Fully exploiting event cameras requires representations that simultaneously preserve fine-grained temporal information, stable and characteristic 2D visual features, and temporally consistent information density, an unmet challenge in existing representations. We introduce Labits: Layered Bidirectional Time Surfaces, a simple yet elegant representation designed to retain all these features. Additionally, we propose a dedicated module for extracting active pixel local optical flow (APLOF), significantly boosting the performance. Our approach achieves an impressive 49% reduction in trajectory end-point error (TEPE) compared to the previous state-of-the-art on the MultiFlow dataset. The code will be released upon acceptance.
Computer Vision and Pattern Recognition, Artificial Intelligence, Emerging Technologies
What problem does this paper attempt to address?
The paper addresses the fact that existing event representations used for continuous dense trajectory estimation do not fully exploit the advantages of event cameras, which causes information loss and caps performance. Specifically:

1. **Limitations of existing representations**:
   - **Loss of temporal information**: Many existing representations (such as event frames, event counts, and time surfaces) discard fine-grained temporal information during conversion, which makes continuous dense trajectories hard to recover.
   - **Unstable spatial features**: Some methods retain a certain amount of temporal information but produce unstable spatial features, so they cannot maintain temporal and spatial consistency at the same time.
   - **Uneven information density**: Existing representations often emphasize recent events and neglect earlier ones, resulting in information density that is uneven over time.
2. **Specific description of the problem**:
   - Continuous dense trajectory estimation requires a representation that simultaneously retains fine-grained temporal information, stable and characteristic 2D visual features, and temporally consistent information density.
   - Existing representations do not meet all three requirements, especially in low-level vision tasks, which limits the potential of event cameras.
3. **The new method proposed in the paper**:
   - The paper introduces a new representation, Layered Bidirectional Time Surfaces (Labits), designed to address the problems above.
   - Labits retains fine-grained temporal information and, through its layered bidirectional time surfaces, aims to preserve the motion trend at each pixel location.
   - The paper also proposes a dedicated module for extracting Active Pixel Local Optical Flow (APLOF), which further improves trajectory estimation performance.
4. **The effect of the improvement**:
   - The approach reduces Trajectory End-Point Error (TEPE) by 49% compared with the previous state-of-the-art method on the MultiFlow dataset.
   - According to the paper, combining Labits with the APLOF extractor yields higher accuracy in dense trajectory estimation.

In summary, the paper aims to overcome the deficiencies of existing methods in temporal information retention, spatial feature stability, and information density by designing a new event representation (Labits), thereby improving the performance of event cameras in continuous dense trajectory estimation.
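The loss of temporal information in conventional representations can be made concrete with a minimal sketch in plain Python. This is not the paper's Labits construction, whose details are not given in this summary. It only builds two of the baseline representations named above: an event count and a classic backward-looking time surface. The `Event` record, the function names, and the integer microsecond timestamps are all assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical event record: pixel coordinates, timestamp in microseconds, polarity.
Event = namedtuple("Event", ["x", "y", "t", "p"])


def event_count(events, width, height):
    """Count events per pixel; every timestamp is discarded."""
    grid = [[0] * width for _ in range(height)]
    for e in events:
        grid[e.y][e.x] += 1
    return grid


def time_surface(events, width, height, t_ref):
    """Classic time surface: time elapsed at t_ref since the most recent
    event at each pixel, or None where no event occurred before t_ref."""
    latest = [[None] * width for _ in range(height)]
    for e in events:
        if e.t > t_ref:
            continue  # a backward-looking surface ignores events after t_ref
        if latest[e.y][e.x] is None or e.t > latest[e.y][e.x]:
            latest[e.y][e.x] = e.t  # an earlier timestamp at this pixel is overwritten
    return [[None if t is None else t_ref - t for t in row] for row in latest]


# Toy stream on a 3x1 sensor: pixel (0, 0) fires twice, pixel (2, 0) never fires.
events = [
    Event(0, 0, 10_000, 1),
    Event(0, 0, 40_000, 1),
    Event(1, 0, 25_000, -1),
]
counts = event_count(events, width=3, height=1)
surface = time_surface(events, width=3, height=1, t_ref=50_000)
print(counts)
print(surface)
```

In this toy stream, the time surface keeps only the later of the two timestamps at pixel (0, 0), and the event count keeps how many events fired but not when. Both drop part of the per-pixel timing that continuous trajectory estimation relies on, which is the gap the summary says Labits is designed to close.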