OpenAnnotate2: Multi-Modal Auto-Annotating for Autonomous Driving

Yijie Zhou, Likun Cai, Xianhui Cheng, Qiming Zhang, Xiangyang Xue, Wenchao Ding, Jian Pu
DOI: https://doi.org/10.1109/tiv.2024.3381602
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: The demand for high-quality annotated data has surged in recent years for applications driven by real-world artificial intelligence, such as autonomous driving and embodied intelligence. Consequently, the development of a tool that can assist humans in the highly automated and high-quality annotation of large-scale, multi-modal data is of significant importance and urgency for both academic research and practical applications. Most existing multi-modal data annotation tools require frame-by-frame, object-by-object annotation with keyboard and mouse, making it challenging to provide high-quality and real-time annotations for 2D images and 3D point clouds in highly open scenarios like autonomous driving. To address these challenges, we propose OpenAnnotate2, which understands human intentions based on natural language prompts and formulates plans to decompose and execute complex multi-modal data annotation tasks. Additionally, the tool can continually enhance its cognitive and annotation capabilities with minimal human-computer interaction, through an ever-updating external knowledge base. This significantly simplifies the annotation workflow, paving the way for the creation of massive datasets suitable for large-scale visual models. The source code will be released at https://github.com/Fudan-ProjectTitan/OpenAnnotate.