GAEFS: Self-supervised Graph Auto-encoder enhanced Feature Selection

Jun Tan, Ning Gui, Zhifeng Qiu
DOI: https://doi.org/10.1016/j.knosys.2024.111523
IF: 8.139
2024-04-01
Knowledge-Based Systems
Abstract: Feature selection is an essential process in machine learning that selects the features contributing most to the prediction target, in order to build more interpretable and robust models. However, most feature selection algorithms must exploit potentially complex correlations among features and samples from very limited labeled samples, and they are sensitive to noise. To this end, we propose Graph Auto-encoder enhanced Feature Selection (GAEFS), which uses graph representation to discover and express non-Euclidean relations among features and samples by translating unlabeled “flat” tabular data into a similarity graph. Through a self-supervised missing data imputation task, rich information on the graph is distilled, and redundant features and noise are removed. Guided by the condensed graph representation, an anti-noise batch-attention feature selection mechanism generates feature weights according to feature selection patterns. The results show that GAEFS achieves significant performance advantages on most classification datasets compared to thirteen state-of-the-art baselines. GAEFS also shows excellent robustness under different noise disturbances. Furthermore, this design allows one-shot feature selection with 1/50 to 1/10 of the labeled data to achieve similar or better performance than other supervised solutions.
computer science, artificial intelligence
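
The first step named in the abstract, translating "flat" tabular data into a similarity graph, can be illustrated with a minimal sketch. The abstract does not say how GAEFS builds its graph, so everything here is an assumption for illustration only: cosine similarity as the measure, a k-nearest-neighbour rule for choosing edges, and the hypothetical names `cosine`, `knn_similarity_graph`, and `table`.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric rows (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def knn_similarity_graph(rows, k=2):
    """Link every row (sample) to its k most similar rows.

    Assumption: this k-NN rule is a common generic construction,
    not necessarily the one used by GAEFS.
    Returns a set of undirected edges (i, j) with i < j.
    """
    n = len(rows)
    edges = set()
    for i in range(n):
        # Similarity of row i to every other row, most similar first.
        sims = [(cosine(rows[i], rows[j]), j) for j in range(n) if j != i]
        sims.sort(key=lambda pair: pair[0], reverse=True)
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Toy unlabeled table: rows are samples, columns are features.
table = [
    [1.0, 0.0, 0.2],
    [0.9, 0.1, 0.3],
    [0.0, 1.0, 0.8],
    [0.1, 0.9, 0.7],
]
graph = knn_similarity_graph(table, k=1)
print(sorted(graph))
```

No labels are used at any point, which matches the abstract's description of working from unlabeled data. A graph auto-encoder would then operate on a structure of this kind, but that stage is not sketched here because the abstract gives no architectural details.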
What problem does this paper attempt to address?