Find the Assembly Mistakes: Error Segmentation for Industrial Applications

Dan Lehman, Tim J. Schoonbeek, Shao-Hsuan Hung, Jacek Kustra, Peter H.N. de With, Fons van der Sommen
2024-08-23
Abstract: Recognizing errors in assembly and maintenance procedures is valuable for industrial applications, since it can increase worker efficiency and prevent unplanned downtime. Although assembly state recognition is gaining attention, none of the current works investigate assembly error localization. Therefore, we propose StateDiffNet, which localizes assembly errors based on detecting the differences between a (correct) intended assembly state and a test image from a similar viewpoint. StateDiffNet is trained on synthetically generated image pairs, providing full control over the type of meaningful change that should be detected. The proposed approach is the first to correctly localize assembly errors taken from real ego-centric video data for both states and error types that are never presented during training. Furthermore, the deployment of change detection to this industrial application provides valuable insights and considerations into the mechanisms of state-of-the-art change detection algorithms. The code and data generation pipeline are publicly available at: https://timschoonbeek.github.io/error_seg
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the identification and localization of assembly errors in industrial applications. Existing research has made progress on assembly state recognition, but it has not investigated how to localize assembly errors. The paper therefore proposes StateDiffNet, which localizes assembly errors by detecting the differences between the correct, intended assembly state and a test image. The method can handle assembly states never seen during training, can deal with more complex assembly configurations, and works on real first-person video data.

The main contributions of the paper are:

1. **First assembly error localization for complex state differences**: The paper proposes the first network for studying error localization under a wide range of complex state differences.
2. **Public data generation pipeline**: The paper provides a method for generating image pairs from 3D models, allowing full control over the type of change between the two images of a pair. The generation pipeline has been made public.
3. **Insights into the mechanisms of state-of-the-art change detection algorithms**: The paper provides valuable insights into recent change detection algorithms when applied in an industrial setting.

### Formula Representation

The following are example formulas that may be used in the paper (inferred from the text content):

- **Cross-Attention Mechanism**:
  \[ \text{GCA}(f_1, f_2) = \sum_{i} w_i \cdot f_2(i) \]
  where \(w_i\) is the similarity weight between the feature vector \(f_2(i)\) and \(f_1\).
- **Linear Multi-Head Self-Attention**:
  \[ \text{MSA}(f) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V \]
  where \(Q\), \(K\), and \(V\) are the query, key, and value matrices, respectively, and \(d_k\) is the dimension of the keys.
- **Local Cross-Attention**:
  \[ \text{LCA}(f_1, f_2) = \sum_{i \in R} w_i \cdot f_2(i) \]
  where \(R\) is the set of indices within the local receptive field.

These formulas help explain the key techniques and methods mentioned in the paper.
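The GCA and LCA formulas differ only in the index set that is summed over, which a small example can make concrete. The sketch below is illustrative only and is not taken from the StateDiffNet code. It assumes that the weights \(w_i\) are a softmax over scaled dot-product similarities (the formulas above do not specify how \(w_i\) is computed) and that features lie on a 1-D sequence rather than a 2-D image grid. The names `cross_attention` and `local_window` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(f1, f2, indices=None):
    """Return sum_i w_i * f2[i] over `indices`, plus the weights w_i.

    f1 is a single query feature vector and f2 is a list of feature vectors.
    Assumption: w_i = softmax_i(f1 . f2[i] / sqrt(d)).
    indices=None sums over every position (GCA-style); passing a subset R
    restricts the sum to a local receptive field (LCA-style).
    """
    idx = list(range(len(f2))) if indices is None else list(indices)
    d = len(f1)
    weights = softmax([dot(f1, f2[i]) / math.sqrt(d) for i in idx])
    out = [0.0] * len(f2[0])
    for w, i in zip(weights, idx):
        for k, v in enumerate(f2[i]):
            out[k] += w * v
    return out, weights


def local_window(center, radius, length):
    """Indices R within `radius` of `center`, clipped to [0, length)."""
    return range(max(0, center - radius), min(length, center + radius + 1))


# Toy features: one query from the first image, four positions in the second.
f1 = [1.0, 0.0]
f2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]

gca_out, gca_w = cross_attention(f1, f2)                         # all positions
lca_out, lca_w = cross_attention(f1, f2, local_window(1, 1, 4))  # R = {0, 1, 2}
print("GCA:", gca_out)
print("LCA:", lca_out)
```

Restricting the index set changes the normalization as well as the sum: the local weights are renormalized over \(R\) only, so positions outside the window contribute nothing to the output.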