Performance Evaluation and Comparison of a New Regression Algorithm

Sabina Gooljar, Kris Manohar, Patrick Hosein
2023-06-15
Abstract: In recent years, Machine Learning algorithms, in particular supervised learning techniques, have been shown to be very effective in solving regression problems. We compare the performance of a newly proposed regression algorithm against four conventional machine learning algorithms, namely Decision Trees, Random Forest, k-Nearest Neighbours and XGBoost. The proposed algorithm was presented in detail in a previous paper, but detailed comparisons were not included. We perform an in-depth comparison, using the Mean Absolute Error (MAE) as the performance metric, on a diverse set of datasets to illustrate the great potential and robustness of the proposed approach. Readers can replicate our results: the source code is provided in a GitHub repository and the datasets are publicly available.
Machine Learning
What problem does this paper attempt to address?
This paper evaluates a newly proposed regression algorithm and compares its performance with that of four conventional machine-learning algorithms (Decision Trees, Random Forest, k-Nearest Neighbours and XGBoost) on regression tasks. Specifically, the authors conduct an in-depth comparison on multiple datasets, using the Mean Absolute Error (MAE) as the performance metric, to show the potential and robustness of the new algorithm. The algorithm predicts the target value of a test sample as a weighted average of the target values of the training data points. Each weight is inversely proportional to the Euclidean distance between the test point and the training point, raised to the power of a parameter \(\kappa\). In this way, the algorithm aims to improve prediction accuracy and robustness, especially for small datasets or when there are insufficient feature-class samples. The main contribution of the paper is a detailed performance comparison between the new algorithm and popular existing algorithms, which positions it as a strong candidate method for future research and applications.
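The weighting scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation from their repository. It only assumes what the summary states: Euclidean distances and weights proportional to 1/d^κ. The names `predict_inverse_distance` and `mean_absolute_error`, the default value of `kappa`, and the handling of a test point that coincides with a training point (here, the mean of the matching targets) are all assumptions made for illustration.

```python
import numpy as np


def predict_inverse_distance(X_train, y_train, X_test, kappa=2.0):
    """Predict targets as a distance-weighted average of training targets.

    Hypothetical sketch: weight_i = 1 / d_i**kappa, where d_i is the
    Euclidean distance from the test point to training point i.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)

    # Pairwise Euclidean distances, shape (n_test, n_train).
    diffs = X_test[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Assumption: a zero distance is treated as an exact match and
    # handled separately to avoid dividing by zero.
    exact = dists == 0.0
    safe_dists = np.where(exact, 1.0, dists)
    weights = np.where(exact, 0.0, 1.0 / safe_dists ** kappa)

    with np.errstate(divide="ignore", invalid="ignore"):
        preds = (weights * y_train).sum(axis=1) / weights.sum(axis=1)

    # For exact matches, use the mean target of the matching training points.
    has_exact = exact.any(axis=1)
    if has_exact.any():
        matched = exact[has_exact]
        preds[has_exact] = (matched * y_train).sum(axis=1) / matched.sum(axis=1)
    return preds


def mean_absolute_error(y_true, y_pred):
    """MAE, the performance metric named in the abstract."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    X_train = [[0.0], [2.0]]
    y_train = [0.0, 10.0]
    X_test = [[1.0], [0.5]]
    preds = predict_inverse_distance(X_train, y_train, X_test, kappa=1.0)
    print(preds, mean_absolute_error([5.0, 2.0], preds))
```

Larger values of `kappa` concentrate the weight on the nearest training points, while `kappa = 0` reduces the prediction to the plain mean of the training targets.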