Global Model Selection for Semi-Supervised Support Vector Machine via Solution Paths

Yajing Fan, Shuyang Yu, Bin Gu, Ziran Xiong, Zhou Zhai, Heng Huang, Yi Chang
DOI: https://doi.org/10.1109/tnnls.2024.3354978
IF: 14.255
2024-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Semi-supervised support vector machine (S³VM) is important because it can use plentiful unlabeled data to improve the generalization accuracy of traditional SVMs. To achieve good performance, S³VM needs an effective way to select hyperparameters. However, model selection for semi-supervised models is still a key open problem. Existing methods that search for the optimal parameter values of semi-supervised models are usually computationally demanding, especially those based on grid search. To address this challenging problem, in this article, we first propose solution paths of S³VM (SPS³VM), which can track the solutions of the nonconvex S³VM with respect to the hyperparameters. Specifically, we apply incremental and decremental learning methods to update the solution so that it satisfies the Karush-Kuhn-Tucker (KKT) conditions. Based on the SPS³VM and the piecewise linearity of the model function, we can find the model with the minimum cross-validation (CV) error over the entire range of candidate hyperparameters by computing the error path of S³VM. Our SPS³VM is the first solution path algorithm for the nonconvex optimization problem of semi-supervised learning models. We also provide the finite convergence analysis and computational complexity of SPS³VM. Experimental results on a variety of benchmark datasets not only verify that our SPS³VM can globally search the hyperparameters (regularization and ramp loss parameters) but also show a huge reduction in computational time while retaining similar or slightly better generalization performance compared with the grid search approach.
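The abstract names "ramp loss parameters" among the hyperparameters being searched but does not define the loss. As a rough illustration only, the sketch below uses the commonly cited form of the ramp loss, written as the difference of two hinge losses; the function names, the parameter name `s`, and its default value are assumptions made for this example and are not taken from the paper.

```python
def hinge_loss(z: float, s: float = 1.0) -> float:
    """Shifted hinge loss H_s(z) = max(0, s - z), where z is the margin y * f(x)."""
    return max(0.0, s - z)


def ramp_loss(z: float, s: float = -1.0) -> float:
    """Ramp loss R_s(z) = H_1(z) - H_s(z), assuming s < 1.

    It equals the ordinary hinge loss for z >= s and is clipped at the
    constant 1 - s for z < s. The clipping is what makes the loss nonconvex.
    `s` here is a hypothetical stand-in for the ramp loss parameter.
    """
    return hinge_loss(z, 1.0) - hinge_loss(z, s)


if __name__ == "__main__":
    # Print the loss at a few margins to show the flat, clipped region on the left.
    for margin in (-5.0, -1.0, 0.0, 1.0, 2.0):
        print(margin, ramp_loss(margin))
```

Under this form, a smaller `s` raises the ceiling `1 - s` and moves the loss closer to the plain hinge, so the choice of `s` affects the solution in a way a regularization parameter alone does not.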
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture