CDFSIP Feature Selection Algorithm Based on ADA-DPC

Yuefeng He, Zhaozhong Wu, Juanying Xie
2023-01-01
Abstract: The feature selection algorithm DFSIP (Discernibility matrix based fully adaptive 2D Feature Selection based on Information gain and Pearson correlation coefficient) applies a single global threshold to every feature in a dataset. To address this limitation, this study proposes a new feature selection algorithm called CDFSIP (Clustering based Fully adaptive 2D DFSIP). CDFSIP relies on the adaptive density peak clustering algorithm ADA-DPC (An Adaptive Clustering Algorithm by Finding Density Peaks) to determine an appropriate neighborhood parameter for calculating the neighborhood information of each feature, thereby establishing an adaptive threshold for each feature. The algorithm evaluates the distribution of each feature using the Gini coefficient and, based on that evaluation, decides whether to use traditional domain division or clustering-based neighborhood division. This adaptive approach selects an optimal neighborhood parameter for each feature, resulting in a feature subset with superior classification capability in complex scenarios. The selected feature subset is then evaluated using the K-ELM classifier. Experiments on benchmark datasets demonstrate that CDFSIP is effective at identifying feature subsets with strong recognition capabilities. In addition, the classifier built on the selected feature subset achieves excellent classification performance compared with classifiers built on feature subsets detected by other feature selection algorithms, including DFSIP, FSIP, mRMR, LLE score, DRJMIM, AVC, and AMID.
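The per-feature decision described in the abstract can be illustrated with a minimal sketch. The abstract does not state which form of the Gini coefficient is used, what cutoff applies, or which outcome triggers clustering, so everything below is an assumption for illustration only: the inequality-style Gini coefficient over non-negative feature values, the hypothetical function names `gini_coefficient` and `choose_division`, the hypothetical `threshold` of 0.5, and the rule that an uneven distribution selects the clustering branch.

```python
def gini_coefficient(values):
    """Inequality-style Gini coefficient of non-negative values.

    Returns 0.0 for a perfectly even distribution and approaches 1.0
    as the total becomes concentrated in a few values.
    Assumption: the paper may use a different Gini variant.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    # Rank-weighted sum over ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def choose_division(values, threshold=0.5):
    """Pick a division strategy for one feature.

    Hypothetical rule: an uneven feature (Gini above the threshold)
    uses clustering-based neighborhood division; otherwise the
    traditional division is kept. Both the direction of the rule and
    the threshold are assumptions, not taken from the paper.
    """
    if gini_coefficient(values) > threshold:
        return "clustering"
    return "traditional"
```

Under these assumptions, `choose_division` would be called once per feature column, and the clustering branch would then hand that feature to a density peak clustering step to derive its neighborhood parameter.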