Multimodal Synthetic Dataset Balancing: a Framework for Realistic and Balanced Training Data Generation in Industrial Settings

Andreas Schwung, Diyar Altinses
DOI: https://doi.org/10.1109/IECON51785.2023.10311948
2023-10-16
Abstract: Deep networks have been successfully applied to industrial applications for clean unimodal data (e.g., sensors, images, or audio). Leveraging multimodal data is a common approach to enhance performance, guided by the principle that a larger quantity of data leads to improvement. However, performance may decline considerably if the data is corrupted (e.g., by noise, blur, or failure). Although researchers have explored various data augmentation methods to improve generalization capacity, these methods are not adapted for industrial settings. The primary distinction is that current augmentation methods are designed to enhance model generalization capabilities, not to realistically simulate real-world industry scenarios. In this paper, we present industry-related augmentation methods for temporal and spatial data for multimodal fusion with deep neural networks. Our methods are specifically designed to encourage modality collaboration and reinforce generalization capability. The impact of the proposed data extension strategy on training multimodal fusion models is assessed on a synthetic dataset from an industrial UR5 robot with varying degrees of imbalance. We analyze different combinations of methods and evaluate their performance. Through these experiments, we identify the challenges of multimodal fusion with deep learning models in an industrial setting.
Computer Science, Engineering