MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion

Guiyong Zheng, Jinqi Jiang, Chen Feng, Shaojie Shen, Boyu Zhou
2024-03-18
Abstract: Surface prediction and completion have been widely studied in various applications. Recently, research in surface completion has evolved from small objects to complex large-scale scenes. As a result, researchers have begun increasing the volume of data and leveraging a greater variety of data modalities, including rendered RGB images, descriptive texts, and depth images, to enhance algorithm performance. However, existing datasets lack sufficient scene-level models and the corresponding multi-modal information. Therefore, a method to scale the datasets and generate multi-modal information in them efficiently is essential. To bridge this research gap, we propose MASSTAR: a Multi-modal lArge-scale Scene dataset with a verSatile Toolchain for surfAce pRediction and completion. We develop a versatile and efficient toolchain for processing the raw 3D data from the environments. It screens out a set of fine-grained scene models and generates the corresponding multi-modal data. Utilizing the toolchain, we then generate an example dataset composed of over a thousand scene-level models with partial real-world data added. We compare MASSTAR with the existing datasets, which validates its superiority: the ability to efficiently extract high-quality models from complex scenarios to expand the dataset. Additionally, several representative surface completion algorithms are benchmarked on MASSTAR, which reveals that existing algorithms can hardly deal with scene-level completion. We will release the source code of our toolchain and the dataset. For more details, please see our project page at
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses several key challenges in current surface prediction and completion research, chiefly the shortage of datasets for large-scale scenes. Specifically:

1. **Insufficient dataset scale and modality diversity**: Existing datasets mostly contain models of small objects, such as chairs and tables, and lack models of large-scale scenes such as buildings and forests. They also offer limited modality diversity, concentrating on one or a few data types such as 3D mesh models or RGB images.
2. **Lack of real-world data**: Most existing datasets consist mainly of synthetic models and lack real-world multi-modal data. The resulting domain gap between simulation and reality degrades algorithm performance in practical applications.
3. **Limited dataset expansion ability**: Existing datasets are usually fixed in scale and lack an effective toolchain for expanding them efficiently, which limits research progress.

To address these problems, the paper proposes MASSTAR (Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion), a multi-modal dataset of large-scale scenes together with an efficient toolchain. The main contributions of MASSTAR are:

1. **A versatile and efficient toolchain**: The toolchain screens out high-quality 3D mesh models from real-world or synthetic environments and generates the corresponding multi-modal information, such as images, descriptive texts, and point clouds.
2. **A multi-modal, large-scale scene dataset**: The dataset contains more than 1,000 scene-level 3D mesh models, some of which come from real-world data.
3. **Benchmarks of representative surface completion algorithms**: The results show that existing surface completion algorithms perform poorly on scene-level tasks, which highlights the value of MASSTAR for advancing this research.
4. **Planned open-source release**: The authors plan to release the source code of the toolchain and the dataset so that researchers can use and improve them.

Through these contributions, MASSTAR aims to advance research in surface prediction and completion, especially for large-scale and complex scenes.
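As an illustration of how a point cloud can be derived from a 3D mesh model, one of the modalities the toolchain is described as generating, the following is a minimal sketch of area-weighted surface sampling with NumPy. It is not the authors' implementation: the function name `sample_point_cloud`, its parameters, and the choice of uniform surface sampling are assumptions made for this example only.

```python
import numpy as np


def sample_point_cloud(vertices, faces, n_points, seed=0):
    """Sample points uniformly on the surface of a triangle mesh.

    Hypothetical helper for illustration; not part of the MASSTAR toolchain.
    vertices: (V, 3) float array of vertex positions.
    faces:    (F, 3) int array of vertex indices per triangle.
    """
    rng = np.random.default_rng(seed)
    tri = vertices[faces]  # (F, 3, 3): the three corners of every triangle

    # Triangle area is half the norm of the cross product of two edges.
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)

    # Pick triangles with probability proportional to their area, so that
    # large faces receive proportionally more samples.
    idx = rng.choice(len(faces), size=n_points, p=areas / areas.sum())

    # Uniform barycentric coordinates via the square-root trick.
    r1 = np.sqrt(rng.random(n_points))
    r2 = rng.random(n_points)
    weights = np.stack([1.0 - r1, r1 * (1.0 - r2), r1 * r2], axis=1)  # (n, 3)

    # Weighted sum of the chosen triangles' corners gives the sampled points.
    return np.einsum("ni,nid->nd", weights, tri[idx])


if __name__ == "__main__":
    # A unit square in the z = 0 plane, split into two triangles.
    square_vertices = np.array(
        [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float
    )
    square_faces = np.array([[0, 1, 2], [0, 2, 3]])
    cloud = sample_point_cloud(square_vertices, square_faces, n_points=1000)
    print(cloud.shape)
```

Area weighting matters for scene-level meshes, where triangle sizes can vary widely: sampling faces uniformly by index would over-represent finely tessellated regions.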