Effects of Random Sampling on SVM Hyper-parameter Tuning

Tomáš Horváth, Rafael G. Mantovani, André C. P. L. F. de Carvalho
DOI: https://doi.org/10.1007/978-3-319-53480-0_27
2017-01-01
Abstract: Hyper-parameter tuning is one of the crucial steps in the successful application of machine learning algorithms to real data. In general, the tuning process is modeled as an optimization problem, for which several methods have been proposed. For complex algorithms, evaluating a hyper-parameter configuration is expensive, so the runtime is reduced through data sampling. In this paper, the effect of sample size on the results of the hyper-parameter tuning process is investigated. Hyper-parameters of Support Vector Machines are tuned on samples of different sizes generated from a dataset. The Hausdorff distance is proposed for computing the differences between the results of hyper-parameter tuning on two samples of different sizes. Experiments with 100 real-world datasets and two tuning methods (Random Search and Particle Swarm Optimization) reveal some interesting relations between sample sizes and the results of hyper-parameter tuning, which open promising directions for future investigation.
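To make the comparison step concrete, the following is a minimal sketch of a symmetric Hausdorff distance between two finite sets of hyper-parameter configurations. The abstract does not state how the paper represents tuning results or which underlying metric it uses, so the Euclidean metric, the function name `hausdorff_distance`, and the example (log2 C, log2 gamma) values are all assumptions made for illustration.

```python
import numpy as np


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two finite point sets.

    Each row of `a` and `b` is one point, e.g. one hyper-parameter
    configuration. The Euclidean metric is an assumption of this sketch.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Directed distance: the point of one set that lies farthest
    # from its nearest neighbour in the other set.
    a_to_b = d.min(axis=1).max()
    b_to_a = d.min(axis=0).max()
    return max(a_to_b, b_to_a)


# Hypothetical (log2 C, log2 gamma) configurations found by tuning on
# a smaller and a larger sample of the same dataset.
small_sample = [[0.0, 0.0], [1.0, 0.0]]
large_sample = [[0.0, 0.0], [4.0, 0.0]]

print(hausdorff_distance(small_sample, large_sample))  # 3.0
```

In this toy case the distance is driven by the configuration (4, 0), whose nearest neighbour in the other set is 3 units away. A small value would indicate that tuning on the two samples returned similar regions of the hyper-parameter space.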