Cotton positioning technique based on binocular vision with implementation of scale-invariant feature transform algorithm

Zhu Rongjie, Zhu Yinghui, Wang Ling, Lu Wei, Luo Hui, Zhang Zhichuan
DOI: https://doi.org/10.11975/j.issn.1002-6819.2016.06.025
2016-01-01
Abstract: The rapid development of agricultural mechanization has made it possible to reduce manual labor hours while increasing efficiency. To provide the mechanical arm of a cotton-picking robot with the motion trajectory parameters it needs, a cotton distance-measuring device based on binocular vision with a full implementation of the SIFT (scale-invariant feature transform) algorithm was introduced, which positioned all 11 pieces of cotton on the plant. In an indoor environment, the cotton images were captured under controlled projector flash lighting, and the unneeded backgrounds were segmented out. The RGB images were converted to grayscale, the gray values were enhanced to make the cotton stand out, and the edges were sharpened, which completed the image preprocessing. The images were then blurred with Gaussian filters at 8 different scales, the DoG (difference of Gaussian) images were computed, and extrema were found by comparing each pixel with its 26 neighbors in the current and adjacent scales; these extrema were the SIFT key points. The key points were invariant to rotation, translation, scaling, and affine transformation, which made them suitable for matching cotton images. The gray-gradient magnitudes of the 4×4 seed points in 8 directions within each key point's neighborhood were calculated, giving a 128-dimensional (4×4×8) SIFT descriptor for each key point. For all SIFT key points in the right image, the dimension with the maximum variance was selected and its median value calculated; the key point at the median became a node, and the remaining key points were split according to the median value. Repeating this step built the binary tree.
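The tree-building step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `KDNode` and `build_kd_tree` are hypothetical, and short 2-dimensional tuples stand in for the 128-dimensional descriptors.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List, Optional, Sequence


@dataclass
class KDNode:
    """One node of the binary tree (hypothetical structure for illustration)."""
    point: Sequence[float]            # descriptor stored at this node
    axis: int                         # dimension used to split at this node
    left: Optional["KDNode"] = None   # descriptors at or below the median
    right: Optional["KDNode"] = None  # descriptors at or above the median


def build_kd_tree(points: List[Sequence[float]]) -> Optional[KDNode]:
    """Recursively split on the max-variance dimension at its median."""
    if not points:
        return None
    dims = len(points[0])
    # Select the dimension whose values vary the most across the points.
    axis = max(range(dims), key=lambda d: pvariance([p[d] for p in points]))
    # The point at the median of that dimension becomes the node.
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(
        point=ordered[mid],
        axis=axis,
        left=build_kd_tree(ordered[:mid]),
        right=build_kd_tree(ordered[mid + 1:]),
    )


def count_nodes(node: Optional[KDNode]) -> int:
    """Count the nodes in a tree; every input point should appear once."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)
```

Choosing the maximum-variance dimension at each level tends to separate the descriptors most evenly, which keeps the later nearest-neighbor search shallow.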
For every SIFT key point in the left image, its potential matches (possibly more than one) were searched for in the binary tree of the right image by descending until a leaf node was reached. The sibling nodes passed along the path were saved in a priority queue using BBF (best bin first), and the search was expanded from these sibling nodes down to their leaves. The nearest and second-nearest neighbors were found according to the similarity of the 128-dimensional descriptors, until the queue was empty or the search exceeded its limit of 200 iterations. In this way, 172 pairs of rough key-point matches between the 2 cotton images were acquired, but the rough matches could still contain wrong matches. To eliminate them, the fundamental matrix was estimated with the RANSAC (random sample consensus) algorithm to recover the epipolar geometry constraint. In each sampling round, the 8-point algorithm was used to compute an initial fundamental matrix, the distance from every point to its corresponding epipolar line was calculated, and the points within the threshold were counted as inliers. This step was repeated, and the fundamental matrix with the most inliers, or the least error when several fundamental matrices had the same number of inliers, was chosen as the final output; the corresponding inliers were called refined cotton matches. The RANSAC algorithm yielded 151 pairs of refined cotton matches containing no wrong matches, which helped make the three-dimensional (3D) reconstruction of the cotton more accurate. The camera was calibrated to obtain its intrinsic matrix, and the essential matrix was then derived from the fundamental matrix and the intrinsic matrix. Decomposing the essential matrix gave the camera's external rotation matrix and translation vector.
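The inlier test inside each RANSAC round can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the convention x_right^T F x_left = 0 is assumed, and the abstract does not give the distance threshold, so the caller must supply one.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (left-image point, right-image point)


def epipolar_distance(F: Sequence[Sequence[float]], left: Point, right: Point) -> float:
    """Distance from the right-image point to the epipolar line F * x_left."""
    x = (left[0], left[1], 1.0)  # homogeneous left-image point
    # Epipolar line a*u + b*v + c = 0 in the right image.
    a, b, c = (sum(F[r][k] * x[k] for k in range(3)) for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0.0:
        return math.inf  # degenerate line: treat the match as an outlier
    return abs(a * right[0] + b * right[1] + c) / norm


def find_inliers(F: Sequence[Sequence[float]], matches: List[Match], threshold: float) -> List[int]:
    """Indices of matches whose epipolar distance is within the threshold."""
    return [
        i for i, (left, right) in enumerate(matches)
        if epipolar_distance(F, left, right) <= threshold
    ]
```

For example, with the fundamental matrix of an ideal rectified pair, `[[0, 0, 0], [0, 0, -1], [0, 1, 0]]`, the epipolar line of a left point is the image row with the same vertical coordinate, so a match is an inlier only if its two points lie on nearly the same row.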
At this point, all inputs needed for the 3D reconstruction were ready: the 151 pairs of refined cotton matches, the intrinsic matrix, the external rotation matrix, and the translation vector. Substituting these inputs into the reconstruction equations transformed the 2D cotton image coordinates into 3D coordinates, which realized the 3D reconstruction of the cotton point cloud on the plant. Finally, the 3D coordinates of every piece of cotton were obtained and their centroid coordinates were calculated. The results showed that all 11 pieces of cotton were successfully positioned in 3D, with an average error of 0.0393 m compared with manual measurement, which indicates that the calculated data are valid and that the binocular vision system is reliable enough for practical application.
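The abstract does not state which reconstruction equations were used, so the sketch below substitutes a common stand-in, midpoint triangulation, followed by the centroid calculation. It assumes image points already converted to normalized coordinates with the intrinsic matrix, and assumes the convention X2 = R X1 + t between the two camera frames; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def rot_transpose_times(R: Sequence[Sequence[float]], v: Sequence[float]) -> Vec3:
    """Compute R^T v for a 3x3 rotation matrix R."""
    return tuple(sum(R[r][c] * v[r] for r in range(3)) for c in range(3))


def triangulate_midpoint(x1: Sequence[float], x2: Sequence[float],
                         R: Sequence[Sequence[float]], t: Sequence[float]) -> Vec3:
    """Midpoint of the shortest segment between the two viewing rays.

    x1, x2 are normalized image coordinates (assumption); the result is
    expressed in the frame of camera 1.
    """
    d1 = (x1[0], x1[1], 1.0)                               # ray from camera 1
    d2 = rot_transpose_times(R, (x2[0], x2[1], 1.0))       # ray from camera 2
    c2 = tuple(-v for v in rot_transpose_times(R, t))      # camera 2 center
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    e, f = dot(d1, c2), dot(d2, c2)
    det = b * b - a * c
    if abs(det) < 1e-12:
        raise ValueError("viewing rays are parallel; cannot triangulate")
    s = (b * f - c * e) / det  # scale along ray 1
    u = (a * f - b * e) / det  # scale along ray 2
    return tuple((s * d1[i] + c2[i] + u * d2[i]) / 2.0 for i in range(3))


def centroid(points: List[Sequence[float]]) -> Vec3:
    """Mean of a set of 3D points, e.g. the point cloud of one piece of cotton."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))
```

With noise-free matches the two rays intersect and the midpoint is the exact 3D point; with real matches the rays are skew, and the midpoint is one reasonable estimate among several.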