Principal Component Analysis With Fuzzy Elastic Net for Feature Selection
Yunlong Gao, Qinting Wu, Zhenghong Xu, Chao Cao, Jinyan Pan, Guifang Shao, Feiping Nie, Qingyuan Zhu
DOI: https://doi.org/10.1109/tfuzz.2024.3466926
IF: 12.253
2024-12-04
IEEE Transactions on Fuzzy Systems
Abstract: Feature selection is a fundamental technique in machine learning and data analysis, playing a crucial role in extracting valuable features from large-scale, high-dimensional datasets that may contain irrelevant features. To enhance the performance of feature selection, regularizers like -norm or -norm are commonly utilized to encourage sparsity. Nonetheless, these traditional regularization techniques encounter certain challenges. When correlations exist among features, sparsity-driven regularization can unfairly shrink the weights of correlated features to zero, thus ignoring the feature correlations and lacking group sparsity properties. While a straightforward combination of -norm and -norm can uncover feature correlations, it lacks adaptability and does not effectively balance sparsity and correlation. To address these challenges, we introduce a novel matrix-based regularization term, called a fuzzy elastic net, into the unsupervised feature selection model. Our model is founded on principal component analysis, a well-established dimensionality reduction technique adept at finding subspaces that retain most of the information in the raw data. The model is enhanced by a fuzzy elastic net, which promotes group or sparsity properties through adaptive parameter tuning. The new regularization term introduces a flexible fuzzy weighted scheme combining the -norm and -norm ( ). This approach allows adaptive adjustment based on data characteristics, offering a tunable balance between selecting discriminative features and identifying correlated ones. Consequently, this regularization term equips the model to handle diverse data analysis tasks flexibly, thereby enhancing adaptability and generalization performance. Furthermore, we propose an efficient optimization strategy to solve this model. Extensive experiments conducted on UCI datasets and real-world datasets demonstrate the effectiveness and efficiency of our proposed method.
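To make the idea of blending a sparsity-promoting matrix norm with a smoother one concrete, the following minimal sketch may help. It is not the fuzzy elastic net of the paper: the exact norms and the fuzzy weighting scheme are not recoverable from the abstract, so the sketch assumes the widely used row-wise ℓ2,1 norm and the squared Frobenius norm as stand-ins, blended with a fixed, hypothetical weight `alpha`. The function names and the example matrix `W` are illustrative assumptions only.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W.

    Penalizing this quantity tends to drive whole rows to zero,
    which is why it is commonly used for feature selection.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def squared_frobenius_norm(W):
    """Sum of squared entries of W (a smooth, non-sparse penalty)."""
    return sum(w * w for row in W for w in row)


def elastic_net_penalty(W, alpha):
    """Hypothetical fixed-weight blend of the two penalties above.

    alpha = 1 gives the pure row-sparse penalty, alpha = 0 the pure
    smooth penalty. This fixed blend is an assumption for illustration,
    not the adaptive fuzzy weighting described in the abstract.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * l21_norm(W) + (1.0 - alpha) * squared_frobenius_norm(W)


def rank_features(W):
    """Rank feature indices by the Euclidean norm of their row, largest first."""
    scores = [math.sqrt(sum(w * w for w in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: scores[i], reverse=True)


# Illustrative projection matrix: 3 features (rows), 2 components (columns).
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
penalty = elastic_net_penalty(W, alpha=0.5)
ranking = rank_features(W)
```

In this toy setting, a feature whose row is entirely zero contributes nothing to the projection and is ranked last, which is the usual way row-sparse projection matrices are read as feature selectors.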
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic