Ensemble Learning with Extremely Randomized k-Nearest Neighbors for Accurate and Efficient Classification
Saber, Abid; Abbas, Moncef; Fergani, Belkacem
DOI: https://doi.org/10.1007/s00354-024-00286-x
2024-12-04
New Generation Computing
Abstract: Ensemble learning has emerged as a potent methodology for enhancing the accuracy, stability, and robustness of machine learning models. In this paper, we introduce a novel ensemble learning approach that combines k-nearest neighbor (kNN) modeling with genetic algorithms (GA) to build a highly efficient classification model, which we call "Extra-kNNs". The model operates across three distinct layers. First, the input layer builds a diverse collection of kNNs by strongly randomizing both the bootstrap sampling and the kNN hyperparameters. Next, a clustering-based transformation layer improves their efficiency. Finally, the output layer uses a GA to optimize the ensemble's performance and produce the final model. The proposed approach addresses the limitations of kNN, including its sensitivity to noise, its difficulties with high-dimensional data, and its relative ineffectiveness in ensemble methods compared to decision trees. We tested our method rigorously on 15 real-world datasets, comparing its performance to that of several individual and ensemble models. Our empirical findings demonstrate that the proposed model achieves higher accuracy and classification efficiency than state-of-the-art algorithms.
computer science, theory & methods, hardware & architecture
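
The input layer described in the abstract (one bootstrap sample and one random hyperparameter draw per kNN) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the candidate values for k and the Minkowski order p, the ensemble size, and the toy data are all assumptions made for illustration. The abstract does not specify the clustering-based transformation layer or the GA-based output layer, so both are omitted here and a plain majority vote stands in for the output stage.

```python
import random
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k, p):
    """Plain kNN: majority label among the k nearest training points."""
    order = sorted(range(len(train_X)),
                   key=lambda i: minkowski(train_X[i], query, p))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def build_random_knns(X, y, n_members, rng):
    """Input-layer sketch: every member gets its own bootstrap sample
    and its own randomly drawn hyperparameters (k, p)."""
    members = []
    for _ in range(n_members):
        # Bootstrap: draw len(X) indices with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        k = rng.choice([1, 3, 5])  # assumed candidate values for k
        p = rng.choice([1, 2])     # assumed: Manhattan or Euclidean
        members.append(([X[i] for i in idx], [y[i] for i in idx], k, p))
    return members


def ensemble_predict(members, query):
    """Stand-in output stage: unweighted majority vote over all members.
    (The paper optimizes the ensemble with a GA instead.)"""
    votes = Counter(knn_predict(bx, by, query, k, p)
                    for bx, by, k, p in members)
    return votes.most_common(1)[0][0]


# Toy data (assumed): two well-separated 2-D classes, 20 points each.
rng = random.Random(0)
X, y = [], []
for label, center in ((0, 0.0), (1, 5.0)):
    for _ in range(20):
        X.append((center + rng.uniform(-1, 1), center + rng.uniform(-1, 1)))
        y.append(label)

members = build_random_knns(X, y, n_members=15, rng=rng)
print(ensemble_predict(members, (0.2, 0.2)))
print(ensemble_predict(members, (5.1, 4.9)))
```

An odd ensemble size and odd values of k avoid tied votes in this two-class toy setting. A GA-based output stage along the lines the abstract describes would replace the unweighted vote in `ensemble_predict`.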