Real-time camera pose tracking with locating image patching scales and regions

Jinghua Miao, Yankui Sun
DOI: https://doi.org/10.11834/jig.160612
2017-01-01
Abstract:
Objective: In a conventional augmented reality system, multi-scale representations of a template image are constructed first. Feature key points are extracted at each scale and pooled into a single template feature set, which is then matched against the feature points extracted from the camera images. The template feature set becomes large when the number of scales is large. However, a camera image corresponds only to pyramid images within a scale range close to its own scale, and it typically overlaps those images only in some regions. Conventional feature matching therefore performs a large amount of useless computation, which lowers the image matching speed and also decreases registration accuracy. This paper proposes an effective method for locating the image matching scales and regions during camera pose tracking to solve this problem. By matching the features of the current camera image locally against the features of the corresponding scales and regions of the template image pyramid, the method computes the camera pose in real time from the matched feature pairs, addressing the accuracy and efficiency problems of feature matching in traditional three-dimensional tracking methods.
Method: In the preprocessing stage, the scale-space layers of the template image are constructed first. Concretely, the original image is down-sampled by a factor of 1.5 to give the second layer. The remaining layers are formed by progressively half-sampling the original image and the second-layer image and interleaving the two sequences, subject to the condition that the resolution of the topmost layer is just below that of the specified screen image. Second, a key frame structure is built for each layer image. Specifically, each layer image is partitioned into rectangular regions of equal size, which may overlap when necessary. The size of the rectangular region is chosen to be similar to that of the layer image at the maximum scale in the scale-space layers. In each region, feature points are extracted and binary descriptors are generated with the oriented FAST and rotated BRIEF (ORB) algorithm; the position of the rectangle, its sub-image, and the feature points within it together form a key frame structure. In this way, the feature descriptors of the image pyramid are organized by scale and region. In the real-time tracking stage, the scale range of the camera image within the image pyramid is located first. The image regions covered within this scale range are then found using defined overlapping-degree rules. This narrows the scope of feature matching between the current camera image and the template image pyramid, and the resulting local feature matching improves both accuracy and efficiency. 1) Locating the scale range: a camera image captured at some distance from the template image essentially corresponds to a scale range in the image pyramid of the template image and overlaps some image regions within that range. To locate the scale range, the current camera pose is first predicted in two ways, from the camera pose of the last frame and by Kalman filtering. The four vertices of the original image are then projected onto the screen image with the estimated camera pose. Finally, the size of the projected area is compared with the sizes of the layer images in the image pyramid to determine the scale range. 2) Calculating the degree of region overlap: all key frame regions of the layer images within the scale range are projected onto the screen image with the estimated camera pose, the areas of the overlapped regions are computed, and the region overlapping degree is derived from them. 3) Local feature extraction and matching: a number of key frames with a large region overlapping degree are selected for the camera image using the last-frame camera pose as the estimate, and further key frames are selected in the same way using the pose estimated by Kalman filtering. The union of the two key frame sets is taken, all of their feature points are matched against those extracted from the camera image by the ORB algorithm, and the camera pose is computed from the matching pairs.
Result: The new algorithm is implemented and run on a smartphone, and it is tested on an open image database (the Stanford mobile visual search dataset) with images of different resolutions as well as on other template images. It is compared with four advanced algorithms: fast locating of image scale and area, ORB, FREAK (fast retina keypoint), and BRISK (binary robust invariant scalable keypoints). In the experiments, videos are recorded for all test template images, covering camera translation, rotation, and scaling relative to the template image. The optimal parameters of the ORB, FREAK, and BRISK algorithms are selected by analysis and testing, and the registration error and frame rate are measured both before and after integrating our feature matching algorithm with the optical flow algorithm. The experimental results show that the new algorithm is robust, achieves a registration error of approximately one pixel, and reaches a real-time 3D tracking rate of 20-30 frames per second.
Conclusion: The algorithm locates the image scale and region much better than before. The accuracy and speed of feature matching between the current camera image and the template image improve markedly compared with several classic algorithms, especially when the image resolution is high. The algorithm can be used to track natural images on a mobile platform.
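The layer construction and the scale-range step described in the Method part can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the estimated camera pose for the planar template is available as a 3x3 homography into screen coordinates, that the pyramid stops at the first layer smaller than the screen image, and that the scale range consists of the two layers whose areas bracket the projected area. The function names `layer_factors`, `projected_area`, and `locate_scale_range` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def layer_factors(img_w: float, img_h: float,
                  screen_w: float, screen_h: float) -> List[float]:
    """Down-sampling factors 1, 1.5, 2, 3, 4, 6, ...: two interleaved halving chains."""
    factors: List[float] = []
    a, b = 1.0, 1.5  # chain starting at the original image / at the second layer
    while True:
        for f in (a, b):
            factors.append(f)
            # Assumed stopping rule: the top layer is the first one below screen size.
            if img_w / f < screen_w and img_h / f < screen_h:
                return factors
        a, b = a * 2.0, b * 2.0


def project(H: Matrix3, p: Point) -> Point:
    """Apply a 3x3 homography (assumed pose representation) to a 2D point."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def polygon_area(pts: Sequence[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, list(pts[1:]) + [pts[0]]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def projected_area(H: Matrix3, img_w: float, img_h: float) -> float:
    """Area on screen of the four projected vertices of the template image."""
    corners = [(0.0, 0.0), (img_w, 0.0), (img_w, img_h), (0.0, img_h)]
    return polygon_area([project(H, c) for c in corners])


def locate_scale_range(area: float, img_w: float, img_h: float,
                       factors: Sequence[float]) -> Tuple[int, int]:
    """Indices of the layers whose areas bracket the projected area (assumed rule)."""
    areas = [(img_w / f) * (img_h / f) for f in factors]  # strictly decreasing
    for i in range(len(areas) - 1):
        if areas[i] >= area >= areas[i + 1]:
            return i, i + 1
    # Projected area lies outside the pyramid: clamp to the nearest end layer.
    return (0, 0) if area > areas[0] else (len(areas) - 1, len(areas) - 1)


if __name__ == "__main__":
    factors = layer_factors(1920, 1440, 640, 480)
    H = [[0.4, 0.0, 0.0], [0.0, 0.4, 0.0], [0.0, 0.0, 1.0]]  # pure scaling, for illustration
    area = projected_area(H, 1920, 1440)
    print(factors, locate_scale_range(area, 1920, 1440, factors))
```

In a tracker following the abstract, this would be evaluated twice per frame, once with the last-frame pose and once with the Kalman-predicted pose, and only the key frame regions of the selected layers would be passed on to the overlap test and feature matching.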