Invariant feature extraction and recognition for shapes
Haoran Xu, Jianyu Yang, Weiguo Huang, Li Shang
DOI: https://doi.org/10.11834/jig.170080
2017-01-01
Journal of Image and Graphics
Abstract:

Objective: The shape of an object's contour is an important cue for image retrieval and object recognition; it is usually represented by a binary image. Although binary images of objects carry few features such as color or texture, humans can still recognize the objects by shape alone. By contrast, computers cannot recognize the shapes of objects directly. In recent years, shape retrieval and recognition have been fundamental topics in computer vision and have been widely studied for applications such as character recognition, biomedical image analysis, hand gesture recognition, robot navigation, and human gait recognition. To extract salient features that characterize a shape, many shape descriptors have been proposed, with promising reported results. However, viewpoint variations and nonlinear deformations, such as significant intra-class differences, geometric transformations, and partial occlusions, remain challenging problems that decrease the accuracy of shape matching and recognition. Most traditional shape descriptors use either local or global shape information and therefore cannot handle shape deformations and intra-class variations simultaneously. Local descriptors represent local shape features effectively but do not consider the global shape structure. By contrast, global descriptors are robust to local noise and deformations but ignore detailed local shape features and cannot deal with occlusion. A novel invariant multi-scale descriptor with different types of invariant features is proposed to capture the local and semi-global features of shapes.

Method: The invariant multi-scale descriptor is defined with five types of invariants, which capture shape features in five forms: area, rate of change of area, arc length, rate of change of arc length, and central distance. These five types of invariants are normalized between 0 and 1 to capture the inconsistent variations adaptively within one shape and to avoid the effect of scale transformation. The proposed multi-scale descriptor calculates the invariants at multiple scales to combine the advantages of local and global descriptors. Small scales capture shape details and large scales represent semi-global features, thereby yielding a rich characterization of shapes. Because the two contours being matched usually have different numbers of sample points, the dynamic time warping (DTW) algorithm is employed to determine the best correspondence between the two sequences of contour points and to provide a similarity measure for two different shapes based on their invariant multi-scale descriptors.

Result: The invariance and robustness of the proposed invariant multi-scale descriptor are evaluated through multiple comparative experiments. In these experiments, the five types of invariants are plotted for shapes affected by different variations, and their Euclidean distances are calculated to show the similarity between different shapes from the same class. The experimental results validate that the proposed descriptor is robust to rotation, scale transformation, partial occlusion, intra-class variations, articulated variations, and noise. Moreover, the effectiveness of the proposed method in shape matching is evaluated in shape retrieval experiments on several benchmark datasets, with the bull's eye score used as the evaluation criterion. In comparison with other methods, the proposed method has the highest accuracy on all four shape datasets: 91.79% on the MPEG-7 dataset, 89.75% on the articulated dataset, 95.27% on Kimia's 99 dataset, and 91.33% on Kimia's 216 dataset. In addition, the average time consumed by shape recognition on the MPEG-7 dataset with the proposed method is 65 ms, which is faster than the other recognition methods. These state-of-the-art results demonstrate that the proposed method is effective for shape recognition and retrieval tasks.

Conclusion: A novel invariant multi-scale descriptor is proposed for shape representation, matching, and recognition. In the proposed descriptor, five types of invariants are used to capture shape features from different aspects. These invariants are calculated at several scales, ensuring that the local and global information of shapes can be represented simultaneously. The DTW algorithm is employed to determine the best correspondence between two sequences of contour points based on their invariant multi-scale descriptors, thereby providing an appropriate similarity measure for different shapes. The experimental results validate that the proposed descriptor is invariant to rotation, scaling, partial occlusion, intra-class variations, and articulated deformations. The plots of the different invariant functions show that both local and semi-global features are captured by the invariants at different scales. The DTW algorithm measures the similarity between different shapes appropriately, regardless of the number of their contour points. The retrieval experiments on the benchmark datasets verify that the proposed method outperforms other popular shape recognition methods in both retrieval accuracy and efficiency. The proposed method is suitable for shape recognition and retrieval tasks in complex environments. However, it cannot use prior knowledge from large datasets to speed up computation or improve the accuracy of shape retrieval and recognition in shape datasets. Future work will therefore introduce metric learning into shape matching to further improve the performance of the proposed method.
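
The DTW matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dtw_distance`, the use of the Euclidean distance as the local cost between two contour points, and the toy two-dimensional descriptor values are all assumptions made for illustration. The sketch also assumes a fixed starting point on each contour and does not handle cyclic shifts of closed contours.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Classic DTW cost between two sequences of per-point descriptor vectors.

    Each element of `a` and `b` is the descriptor vector of one contour point
    (hypothetical layout). The sequences may have different lengths.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two descriptor vectors
            # (an assumed choice for this sketch).
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again, b stays
                cost[i][j - 1],      # b[j-1] matched again, a stays
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy descriptors: shape_b repeats the first point of shape_a,
# so the two contours have different numbers of sample points.
shape_a = [[0.1, 0.5], [0.2, 0.6], [0.9, 0.4]]
shape_b = [[0.1, 0.5], [0.1, 0.5], [0.2, 0.6], [0.9, 0.4]]
print(dtw_distance(shape_a, shape_b))  # 0.0
```

In the toy example the repeated point is absorbed by the warping path, so the accumulated cost is zero even though the sequences differ in length. This is the property that makes DTW usable when two contours are sampled with different numbers of points.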