Surrogate Sample-Assisted Particle Swarm Optimization for Feature Selection on High-Dimensional Data

Xianfang Song, Yong Zhang, Dunwei Gong, Hui Liu, Wanqiu Zhang
DOI: https://doi.org/10.1109/tevc.2022.3175226
IF: 16.497
2022-01-01
IEEE Transactions on Evolutionary Computation
Abstract: As the number of features and the sample size grow, existing feature selection (FS) methods based on evolutionary optimization still face challenges such as the “curse of dimensionality” and high computational cost. To address this, this article proposes a hybrid FS algorithm using surrogate sample-assisted particle swarm optimization (SS-PSO), which divides or clusters the sample and feature spaces at the same time. First, a nonrepetitive uniform sampling strategy is employed to divide the whole sample set into several small sample subsets. Next, regarding each sample subset as a surrogate unit, a collaborative feature clustering mechanism is proposed to divide the feature space, with the purpose of reducing both the computational cost of clustering features and the search space of PSO. Following that, an ensemble surrogate-assisted integer PSO is proposed. To ensure the prediction accuracy of the ensemble surrogate when evaluating particles, an ensemble surrogate construction and management strategy is designed. Since the whole sample set is replaced by a small number of surrogate units, SS-PSO significantly reduces the cost of evaluating particles in PSO. Finally, the proposed algorithm is applied to several typical datasets and compared with six typical evolutionary FS algorithms, as well as several of its own variants. The experimental results show that SS-PSO obtains good feature subsets at the smallest computational cost on most datasets. These results indicate that SS-PSO is a highly competitive method for high-dimensional FS.
computer science, artificial intelligence, theory & methods
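The first step named in the abstract, dividing the whole sample set into several small subsets by nonrepetitive uniform sampling, can be illustrated with a minimal sketch. The abstract does not give the procedure, so this assumes that "nonrepetitive" means sampling without replacement, so that every sample lands in exactly one subset. The function name `split_into_surrogate_units` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def split_into_surrogate_units(n_samples, n_units, seed=0):
    """Partition sample indices 0..n_samples-1 into n_units disjoint subsets.

    Assumption (not from the paper): "nonrepetitive uniform sampling" is
    modeled as a uniform random shuffle followed by a disjoint split.
    """
    if not 1 <= n_units <= n_samples:
        raise ValueError("n_units must be between 1 and n_samples")
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)  # uniform random order, no index repeated
    # Deal the shuffled indices round-robin so subset sizes differ by at most one.
    return [sorted(indices[k::n_units]) for k in range(n_units)]


if __name__ == "__main__":
    for unit_id, unit in enumerate(split_into_surrogate_units(10, 3)):
        print(unit_id, unit)
```

Under this assumption, each subset would play the role of one surrogate unit, and evaluating a particle on a single unit touches roughly `n_samples / n_units` samples instead of the whole set.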