IKEA Object State Dataset: A 6DoF object pose estimation dataset and benchmark for multi-state assembly objects

Yongzhi Su, Mingxin Liu, Jason Rambach, Antonia Pehrson, Anton Berg, Didier Stricker
DOI: https://doi.org/10.48550/arXiv.2111.08614
2021-11-17
Abstract: Utilizing 6DoF (Degrees of Freedom) pose information of an object and its components is critical for object state detection tasks. We present IKEA Object State Dataset, a new dataset that contains IKEA furniture 3D models, RGBD video of the assembly process, the 6DoF pose of furniture parts and their bounding box. The proposed dataset will be available at https://github.com/mxllmx/IKEAObjectStateDataset.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to accurately detect and estimate the six-degree-of-freedom (6DoF) pose of furniture parts in augmented reality (AR)-assisted furniture assembly. Specifically, the authors create a 6DoF pose dataset of multi-state assembly objects to support object state detection, that is, identifying the current assembly step or state during a complex assembly process.

### Problem Background

1. **Applications of AR Technology**
   - AR is widely used in Industry 4.0 and intelligent manufacturing, with notable advantages in maintenance, assembly, and repair tasks.
   - AR-assisted assembly can significantly increase assembly speed and reduce human error.
2. **Limitations of Existing Datasets**
   - Existing object state detection methods rely on deep-learning models, which require large amounts of training data annotated with 6DoF poses.
   - There is a gap between synthetic and real-world data: synthetic data struggles to reproduce the occlusions and human interactions found in real, complex scenes.
   - Other existing datasets focus on human action recognition or on pose estimation of non-deformable rigid objects, and are not suited to complex assembly tasks.

### Main Contributions of the Paper

- **IKEA Object State Dataset**
  - Provides a large-scale, multi-view IKEA furniture assembly dataset.
  - Contains 3D models, RGBD videos, and the 6DoF pose and bounding box of each part.
  - Covers multiple IKEA furniture types and records every assembly step from the initial state to completion.
  - Records the assembly process synchronously with RGBD cameras from 4 different views, so that parts hidden in one view can still be observed in another.
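To make the "6DoF pose of each part and its bounding box" concrete, here is a minimal sketch of how such a pose is commonly represented and applied. It is an illustration, not the dataset's actual file format: the helper names (`pose_matrix`, `rot_z`, `box_corners`, `transform_points`), the model-to-camera frame convention, and the example board dimensions are all assumptions.

```python
import numpy as np

def pose_matrix(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector translation."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def rot_z(angle_rad):
    """Rotation by angle_rad about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def box_corners(min_xyz, max_xyz):
    """Return the 8 corners of an axis-aligned box in the part's model frame."""
    lo, hi = np.asarray(min_xyz, float), np.asarray(max_xyz, float)
    return np.array([[x, y, z]
                     for x in (lo[0], hi[0])
                     for y in (lo[1], hi[1])
                     for z in (lo[2], hi[2])])

def transform_points(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    return points @ T[:3, :3].T + T[:3, 3]

# Hypothetical part: a 0.4 x 0.2 x 0.02 m board, rotated 90 degrees about z
# and placed 1 m along the camera's z axis (assumed model -> camera convention).
T_cam_part = pose_matrix(rot_z(np.pi / 2), [0.0, 0.0, 1.0])
corners_cam = transform_points(T_cam_part, box_corners([0, 0, 0], [0.4, 0.2, 0.02]))
```

The three rotational and three translational degrees of freedom live in the rotation block and the last column of the matrix; transforming the model-frame box corners gives the part's 3D bounding box in the camera frame.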
### Characteristics of the Dataset

- **Hardware Setup**
  - Uses 4 Microsoft Azure Kinect cameras, placed in a circle with the lenses pointing down at the workspace.
  - Camera positions are determined with calibration boards to keep the multi-view data consistent.
- **Data Collection**
  - Records RGB and depth images and stores them in the Matroska file format.
  - Manually classifies the images into assembly states.
- **Data Annotation**
  - The first frame of each assembly state is annotated manually; the remaining frames are tracked with the ICP (Iterative Closest Point) algorithm.
  - The multi-view data is used to reconstruct the scene as a point cloud, the 3D models are aligned to it with ICP, and all poses are then optimized.

### Conclusions and Future Work

- The dataset focuses on providing fully annotated frames, including the 6DoF pose of all object components, filling a gap in existing datasets for assembly tasks.
- Future work will continue to improve the data processing and the benchmarking experiments to raise the quality and broaden the applicability of the dataset.

With this dataset, researchers can better understand complex assembly processes and develop more accurate object state detection algorithms, advancing AR-assisted assembly technology.
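The ICP alignment mentioned under Data Annotation can be illustrated with a generic point-to-point ICP. This is a textbook sketch, not the authors' pipeline: the function names, the brute-force nearest-neighbour search, the fixed iteration count, and the synthetic grid used as a stand-in "model" are all assumptions made for illustration. A real pipeline would more likely use a KD-tree or an existing registration library.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t approximates dst_i (Kabsch method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(model, scene, init_R=None, init_t=None, iterations=30):
    """Point-to-point ICP: align model points (N, 3) to scene points (M, 3)."""
    R = np.eye(3) if init_R is None else init_R
    t = np.zeros(3) if init_t is None else init_t
    for _ in range(iterations):
        moved = model @ R.T + t
        # Brute-force nearest neighbour in the scene for every moved model point.
        d2 = ((moved[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
        nearest = scene[d2.argmin(axis=1)]
        # Re-estimate the full model -> scene transform from current matches.
        R, t = best_rigid_transform(model, nearest)
    return R, t

# Synthetic check: a 5x5x5 grid stands in for a part model; the "scene" is
# the same grid under a small known rotation and translation.
g = np.linspace(-0.5, 0.5, 5)
model = np.array([[x, y, z] for x in g for y in g for z in g])
a = np.deg2rad(2.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.01, -0.02, 0.015])
scene = model @ R_true.T + t_true
R_est, t_est = icp(model, scene)
```

ICP only converges to the right pose from a nearby starting estimate, which is consistent with the described workflow of annotating the first frame of each state manually and letting ICP track the rest: the pose from the previous frame would be passed as `init_R` and `init_t`.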