Abstract: Owing to recent advances in hardware and software technologies, industrial automation has improved significantly over the past few decades. For random bin picking applications, using machine-vision-based approaches to estimate the 3D poses of workpieces is an emerging trend. In this work, we present a robotic grasping system with multi-view depth image acquisition. First, RANSAC and an outlier filter are adopted for noise removal and multi-object segmentation. A voting scheme is then used for preliminary pose computation, followed by the ICP algorithm to derive a more precise target orientation. A model-based registration approach using a genetic algorithm with parameter minimization is proposed for 6-DOF pose estimation. Finally, grasping efficiency is increased by disturbance detection, which reduces the number of 3D scans needed across multiple grasping operations. The experiments are carried out in a real-world scene, and the performance evaluation demonstrates the feasibility of the proposed technique.
Engineering, Electrical & Electronic; Instruments & Instrumentation; Physics, Applied
What problem does this paper attempt to address?
The paper aims to improve robotic grasping in complex, unstructured environments, especially for objects with irregular shapes, textureless surfaces, or no color information. Specifically, it focuses on the following aspects:
1. **Accuracy of 3D pose estimation**: In applications such as random bin picking, accurately estimating the 3D pose of each workpiece is the key to efficient robotic grasping. Traditional methods perform poorly on noisy point-cloud data and under occlusion, so a more robust method is needed to improve pose accuracy.
2. **Multi-view image acquisition and fusion**: A single RGB-D camera can yield incomplete 3D point-cloud data or a low signal-to-noise ratio. To overcome this, the paper uses multiple RGB-D cameras to acquire images from different viewpoints and fuses the data to obtain more complete scene information.
3. **Model-based pose estimation**: The method aligns a CAD model with the captured 3D point-cloud data to estimate the 6-degree-of-freedom (6-DOF) pose. To this end, the paper introduces a strategy that combines a genetic algorithm with a Hough voting mechanism to improve the efficiency and accuracy of pose estimation.
4. **Improvement of grasping efficiency**: Disturbance detection reduces the need for repeated scans, which improves the efficiency of grasping operations. The system also considers the visibility and operability of the target object so that the robot can select the best grasping position.
In summary, the paper aims to develop a vision-guided robotic grasping system that can accurately identify and grasp various types of objects in an industrial automation environment, even when the objects lack distinctive features. By improving 3D pose estimation and fusing multi-view data, the system is intended to raise both the success rate and the efficiency of grasping tasks.
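As a rough illustration of the multi-view fusion step described above, the sketch below maps the point cloud from each camera into a shared frame and stacks the results. It assumes that each camera has a known 4x4 camera-to-world transform from a prior calibration. The function name `fuse_views`, the `extrinsics` argument, and the two example cameras are hypothetical and do not come from the paper, which may fuse its views differently.

```python
import numpy as np

def fuse_views(clouds, extrinsics):
    """Map each (N_i, 3) cloud into a shared frame and stack the results.

    extrinsics[i] is assumed to be a 4x4 camera-to-world homogeneous
    transform for camera i (an assumption, e.g. from offline calibration).
    """
    fused = []
    for points, T in zip(clouds, extrinsics):
        R, t = T[:3, :3], T[:3, 3]
        fused.append(points @ R.T + t)  # apply x_world = R x_cam + t row-wise
    return np.vstack(fused)

# Two hypothetical cameras observing the same world point (1, 0, 0).
T_a = np.eye(4)                                  # camera a sits at the world origin
T_b = np.eye(4)
T_b[:3, :3] = np.array([[0.0, -1.0, 0.0],
                        [1.0,  0.0, 0.0],
                        [0.0,  0.0, 1.0]])       # 90 degree rotation about z
T_b[:3, 3] = [1.0, 0.0, 0.0]                     # camera b sits at the observed point

cloud_a = np.array([[1.0, 0.0, 0.0]])
cloud_b = np.array([[0.0, 0.0, 0.0]])            # the point lies at camera b's origin
fused_cloud = fuse_views([cloud_a, cloud_b], [T_a, T_b])
```

In this toy setup both rows of `fused_cloud` land on the same world coordinate, which is the property a fusion step relies on before any noise filtering or segmentation.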
### Formula Explanation
Some of the key formulas involved in the paper are as follows:
- **Principal Component Analysis (PCA) for calculating surface normals**:
\[
C=\frac{1}{k}\sum_{i = 1}^{k}(p_i-\bar{p})(p_i-\bar{p})^\top
\]
\[
C\cdot\vec{v}_j=\lambda_j\cdot\vec{v}_j,\quad j\in\{0,1,2\}
\]
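where \(k\) is the number of points in the neighborhood, \(p_i\) are those points, \(\bar{p}\) is their centroid, and \(\lambda_j\) and \(\vec{v}_j\) are the \(j\)-th eigenvalue and eigenvector of the covariance matrix \(C\), respectively. Assuming the usual ordering \(\lambda_0\le\lambda_1\le\lambda_2\), the eigenvector \(\vec{v}_0\) of the smallest eigenvalue is taken as the surface normal.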
- **Curvature calculation**:
\[
\sigma=\frac{\lambda_0}{\lambda_0+\lambda_1+\lambda_2}
\]
- **Genetic algorithm fitness function**:
\[
E=\sum\|Rp + T - q\|^2
\]
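where \(R = R_z(\gamma)R_y(\beta)R_x(\alpha)\) is the rotation composed of rotations by \(\alpha\), \(\beta\), and \(\gamma\) about the \(x\), \(y\), and \(z\) axes, \(T=[M_x,M_y,M_z]^\top\) is the translation, \(p\) denotes the points of the scene model, and \(q\) denotes the corresponding points of the CAD model. The six parameters \((\alpha,\beta,\gamma,M_x,M_y,M_z)\) define the 6-DOF pose.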
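The normal and curvature formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes the neighborhood is already given as a `(k, 3)` array and that the eigenvalues are sorted in ascending order, so that \(\lambda_0\) is the smallest. The function name `normal_and_curvature` and the flat test patch are hypothetical.

```python
import numpy as np

def normal_and_curvature(neighbors):
    """Return (unit normal, curvature sigma) for a (k, 3) neighborhood."""
    centroid = neighbors.mean(axis=0)
    centered = neighbors - centroid
    C = centered.T @ centered / len(neighbors)   # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    normal = eigvecs[:, 0]                       # eigenvector of the smallest eigenvalue
    total = eigvals.sum()
    sigma = eigvals[0] / total if total > 0 else 0.0
    return normal, sigma

# Hypothetical test patch: a flat 5x5 grid of points in the z = 0 plane.
xs, ys = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
patch = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
normal, sigma = normal_and_curvature(patch)
```

For a perfectly flat patch the normal is parallel to the z axis (its sign is arbitrary) and \(\sigma\) is zero. As the surface bends, \(\sigma\) grows toward its upper limit of 1/3, which is reached when the points spread equally in all directions.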
Together, these formulas cover feature extraction (surface normals and curvature) and the error term minimized during pose estimation.
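To make the fitness function concrete, the sketch below evaluates \(E\) for a candidate pose \((\alpha,\beta,\gamma,M_x,M_y,M_z)\). It assumes that row \(i\) of the scene points corresponds to row \(i\) of the CAD model points. The paper may establish correspondences differently, and the genetic search itself is not shown. The names `rotation_zyx` and `fitness` and the synthetic data are hypothetical.

```python
import numpy as np

def rotation_zyx(alpha, beta, gamma):
    """R = Rz(gamma) @ Ry(beta) @ Rx(alpha), angles in radians."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    Ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    Rz = np.array([[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def fitness(params, scene_pts, model_pts):
    """E = sum ||R p + T - q||^2 over row-wise paired points (assumed pairing)."""
    alpha, beta, gamma, mx, my, mz = params
    R = rotation_zyx(alpha, beta, gamma)
    T = np.array([mx, my, mz])
    residual = scene_pts @ R.T + T - model_pts
    return float(np.sum(residual ** 2))

# Synthetic check: build model points from scene points with a known pose.
rng = np.random.default_rng(0)
scene = rng.uniform(-1.0, 1.0, size=(50, 3))
true_pose = np.array([0.1, -0.2, 0.3, 0.05, 0.0, -0.1])
model = scene @ rotation_zyx(*true_pose[:3]).T + true_pose[3:]

E_true = fitness(true_pose, scene, model)        # error at the true pose
E_identity = fitness(np.zeros(6), scene, model)  # error at the identity pose
```

On this synthetic data the error vanishes (up to rounding) at the true pose and is strictly larger at the identity pose. A genetic algorithm would treat each six-parameter vector as an individual and search for the one with the lowest \(E\).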