Optimal Feature Subset Determination for High-Dimensional Datasets in Manufacturing Processes

Yuhang Fang, Xiang Li, Wen Feng Lu
DOI: https://doi.org/10.1109/ICIAI.2019.8850820
2019-01-01
Abstract: In manufacturing processes, selecting the most relevant feature subset from a high-dimensional dataset for product quality control is challenging when both good predictive performance and computational speed are required. In this paper, an optimal feature subset determination approach for high-dimensional datasets is proposed. The algorithm starts by ranking the importance of individual features using normalized mutual information. Then, by optimizing a new evaluation metric, an initial relevant feature subset is obtained. Finally, redundant features are eliminated to obtain the final selection. The proposed method was successfully applied to a real-world high-dimensional feature selection problem in the semiconductor industry. It was compared with four other representative feature selection algorithms (Relief-F, mRMR, FCBF, and IWFS) on a number of datasets from different applications. The results demonstrate that the proposed method can automatically determine an optimal feature subset that achieves good average predictive accuracy with fewer computational resources.
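The three stages described in the abstract (ranking, initial subset, redundancy elimination) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the normalization or the new evaluation metric, so this sketch assumes symmetric uncertainty, 2I(X;Y) / (H(X) + H(Y)), as the normalized mutual information, substitutes a plain top-k cut for the metric optimization, and drops a feature when its normalized mutual information with an already kept feature reaches a threshold. The names `nmi`, `select_features`, `top_k`, and `redundancy_threshold` are hypothetical, and features are assumed to be discrete.

```python
from collections import Counter
from math import log


def entropy(xs):
    """Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def nmi(xs, ys):
    """Assumed normalization: symmetric uncertainty, a value in [0, 1]."""
    denom = entropy(xs) + entropy(ys)
    return 0.0 if denom == 0 else 2.0 * mutual_information(xs, ys) / denom


def select_features(features, target, top_k, redundancy_threshold):
    """features: dict mapping a feature name to its list of discrete values."""
    # Stage 1: rank individual features by NMI with the target.
    ranked = sorted(features, key=lambda f: nmi(features[f], target), reverse=True)
    # Stage 2 (placeholder): the paper optimizes an evaluation metric here;
    # this sketch simply keeps the top_k ranked features.
    initial = ranked[:top_k]
    # Stage 3: greedily drop features that are redundant with a kept one.
    kept = []
    for f in initial:
        if all(nmi(features[f], features[g]) < redundancy_threshold for g in kept):
            kept.append(f)
    return kept


# Toy data: "b" is an inverted copy of "a" (fully redundant with it),
# "c" is independent of the target, "d" is partially informative.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [1, 1, 1, 1, 0, 0, 0, 0],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
    "d": [0, 0, 0, 1, 1, 1, 1, 1],
}
selected = select_features(features, target, top_k=3, redundancy_threshold=0.9)
```

On this toy data the top-3 cut removes the uninformative feature "c", and the redundancy stage keeps only one of "a" and "b" alongside "d". Continuous process measurements would need to be discretized first, or the mutual information estimated differently.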