Object tracking algorithm via weighted margin structured support vector machine
Shaojie Jiang, Jifeng Ning, Yunsong Li
DOI: https://doi.org/10.11834/jig.160651
2017-01-01
Journal of Image and Graphics
Abstract: Objective Various distracting factors make the tracking state unpredictable during object tracking. For example, when a tracked object is occluded, ordinary trackers are likely to suffer from model drift. This problem occurs when the target is temporarily affected by other objects or by illumination conditions. The training samples extracted from such frames are less reliable than those extracted from normal scenes. If these samples are fed into the model-updating process, the tracking model deviates, and the decision boundary between target samples and background samples becomes blurred. Consequently, the discriminability of the tracking model is degraded, which in turn causes model drift. If the training samples can be tagged with different confidence levels and the tagged samples are used to train the tracking model, the tracker is expected to tolerate unexpected scenes.
Method We base our work on the weighted margin support vector machine (WMSVM) classifier and the recently proposed structured support vector machine (SSVM) tracking algorithm. We propose a weighted margin SSVM (WMSSVM) tracking model that takes the confidence values of training samples into account, thereby enabling the SSVM tracking algorithm to adapt to different scenes. First, sample confidence is estimated from the score range of the tracker and the overlap rate between the target positions predicted in the current and previous frames. This estimate reliably reflects how confidence varies over the entire tracking process. Second, a WMSSVM tracking model is built to train on samples of different confidence levels, so that these samples influence the decision boundary of the trained tracker to different degrees. Accordingly, the WMSSVM model can adapt to the different scenes encountered during tracking. Finally, we demonstrate that the tracking model can be solved by using a dual coordinate descent algorithm.
Result We evaluate our tracker on the object tracking benchmark OTB-100, which contains 100 challenging video sequences. Compared with the base tracker, dual linear structured support vector machine (DLSSVM), the proposed WMSSVM achieves an increase of 1% in one-pass evaluation (OPE) precision and 2% in OPE overlap (success). This result indicates that the proposed WMSSVM model performs better than the original SSVM model used by scale-DLSSVM in both OPE precision and OPE success. The tracking algorithm that uses the WMSSVM model also improves by 3.4% in background clutter scenes, 3.4% in deformation scenes, 1.4% in fast-motion scenes, 3.4% in motion blur scenes, 3.6% in occlusion scenes, and 4.3% in out-of-view scenes. These are typical difficult scenes in the tracking field. The proposed tracker performs worse than the original scale-DLSSVM tracker on the sequence Shaking (its OPE precision score drops from 0.459 to 0.056). The reason is a drastic illumination variation that occurs at approximately the 60th frame, lasts for approximately nine frames, and affects nearly the entire camera scene. This condition is difficult for the tracker to handle. The proposed WMSSVM tracker also shows promising results compared with other state-of-the-art tracking algorithms. Although our tracker scores lower than hierarchical convolutional features for visual tracking (HCFT) in OPE precision, two observations are made. First, the OPE precision metric uses a fixed distance threshold of 20 pixels between the predicted target position and the ground truth position, whereas the OPE success metric uses the area under the curve as the score. Thus, the better performance of WMSSVM in OPE success is more convincing. Second, HCFT uses a high-dimensional feature from an off-line neural network trained on a large object detection dataset, which is time-consuming. By contrast, the proposed WMSSVM uses only online-extracted features, consisting of Lab color features and local rank transformation features.
Conclusion In this study, the confidence of samples is first incorporated into the SSVM tracking model, and then the WMSSVM tracking model and its optimization method are proposed. The effectiveness of the proposed method is validated on a tracking benchmark dataset with 100 sequences. The proposed method can track objects in complicated scenes, and it exhibits remarkable performance in videos with background clutter, deformation, occlusion, motion blur, and fast motion.
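The contrast drawn in the Result section between OPE precision (a single 20-pixel threshold) and OPE success (area under the curve) can be made concrete with a minimal sketch. This is not the authors' evaluation code. The `(x, y, w, h)` box layout, the function names, the 21 evenly spaced overlap thresholds, and the strict `>` comparison are all assumptions chosen for illustration.

```python
import math

def center_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)

def overlap(pred, gt):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1, y1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    x2 = min(pred[0] + pred[2], gt[0] + gt[2])
    y2 = min(pred[1] + pred[3], gt[1] + gt[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0

def ope_precision(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within the pixel threshold."""
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= threshold for e in errors) / len(errors)

def ope_success_auc(preds, gts, num_thresholds=21):
    """Mean success rate over evenly spaced overlap thresholds in [0, 1].

    Assumption: the area under the success curve is approximated by
    averaging the success rates at the sampled thresholds.
    """
    overlaps = [overlap(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)

# Toy example: frame 1 is tracked perfectly, frame 2 misses the target.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (30, 0, 10, 10)]
precision = ope_precision(preds, gts)
success = ope_success_auc(preds, gts)
```

Because precision depends on one fixed threshold while the success score averages over all overlap thresholds, the latter is less sensitive to the choice of any single cut-off, which is the point made in the abstract.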