RoBoSS: A Robust, Bounded, Sparse, and Smooth Loss Function for Supervised Learning

Mushir Akhtar, M. Tanveer, Mohd. Arshad
DOI: https://doi.org/10.1109/TPAMI.2024.3465535
2024-09-21
Abstract: In the domain of machine learning, the significance of the loss function is paramount, especially in supervised learning tasks. It serves as a fundamental pillar that profoundly influences the behavior and efficacy of supervised learning algorithms. Traditional loss functions, though widely used, often struggle to handle outlier-prone and high-dimensional data, resulting in suboptimal outcomes and slow convergence during training. In this paper, we address the aforementioned constraints by proposing a novel robust, bounded, sparse, and smooth (RoBoSS) loss function for supervised learning. Further, we incorporate the RoBoSS loss within the framework of support vector machine (SVM) and introduce a new robust algorithm named $\mathcal{L}_{RoBoSS}$-SVM. For the theoretical analysis, the classification-calibrated property and generalization ability are also presented. These investigations are crucial for gaining deeper insights into the robustness of the RoBoSS loss function in classification problems and its potential to generalize well to unseen data. To validate the potency of the proposed $\mathcal{L}_{RoBoSS}$-SVM, we assess it on $88$ benchmark datasets from KEEL and UCI repositories. Further, to rigorously evaluate its performance in challenging scenarios, we conducted an assessment using datasets intentionally infused with outliers and label noise. Additionally, to exemplify the effectiveness of $\mathcal{L}_{RoBoSS}$-SVM within the biomedical domain, we evaluated it on two medical datasets: the electroencephalogram (EEG) signal dataset and the breast cancer (BreaKHis) dataset. The numerical results substantiate the superiority of the proposed $\mathcal{L}_{RoBoSS}$-SVM model, both in terms of its remarkable generalization performance and its efficiency in training time.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the following problem: in supervised learning, traditional loss functions struggle with outlier-prone and high-dimensional data, which leads to suboptimal results and slow convergence during training. To address this, the paper proposes a new loss function, RoBoSS (Robust, Bounded, Sparse, and Smooth), and applies it within the support vector machine (SVM) framework, introducing a new algorithm called $\mathcal{L}_{RoBoSS}$-SVM. Specifically, the RoBoSS loss function has the following characteristics:
1. **Robustness**: Better handles outliers.
2. **Boundedness**: Limits the influence of extreme values.
3. **Sparsity**: Uses only misclassified samples or samples close to the decision boundary for training.
4. **Smoothness**: Makes the optimization process more stable and efficient.
Through these characteristics, the RoBoSS loss function aims to improve the stability and generalization ability of supervised learning models. The paper also provides a theoretical analysis of the proposed loss function, covering its classification-calibrated property and its generalization ability. Experimental evaluations on 88 benchmark datasets from the KEEL and UCI repositories, including versions with injected outliers and label noise, are reported to show better generalization performance and shorter training time. The paper further illustrates the method on two biomedical datasets: an EEG signal dataset and the BreaKHis breast cancer dataset.
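To make the four properties concrete, here is a minimal Python sketch of one loss with a bounded, sparse, and smooth shape, compared against the standard hinge loss. The functional form, the function names `roboss_like_loss` and `hinge_loss`, and the parameters `a` (shape) and `lam` (bound) are assumptions made for illustration; this summary does not state the loss formula, so the paper itself should be consulted for the exact definition.

```python
import math


def hinge_loss(margin):
    """Standard hinge loss on the margin y * f(x); unbounded as the margin goes to -inf."""
    return max(0.0, 1.0 - margin)


def roboss_like_loss(margin, a=1.0, lam=1.0):
    """Illustrative bounded, sparse, smooth loss (assumed form, not taken from this summary).

    margin: y * f(x) for a label y in {-1, +1} and a decision value f(x).
    a:      hypothetical shape parameter (a > 0).
    lam:    hypothetical bounding parameter (lam > 0); the loss never exceeds lam.
    """
    u = 1.0 - margin  # margin violation
    if u <= 0.0:
        # Sparsity: samples classified correctly with margin >= 1 contribute nothing.
        return 0.0
    # Boundedness: the value approaches lam as u grows, so a single outlier
    # cannot dominate the objective. Smoothness: the derivative with respect
    # to u is lam * a**2 * u * exp(-a * u), which is 0 at u = 0, so the two
    # pieces join without a kink.
    return lam * (1.0 - (a * u + 1.0) * math.exp(-a * u))


if __name__ == "__main__":
    for m in (2.0, 1.0, 0.5, 0.0, -10.0, -1000.0):
        print(f"margin={m:8.1f}  hinge={hinge_loss(m):8.3f}  bounded={roboss_like_loss(m):.6f}")
```

In this sketch, a grossly misclassified point (a very negative margin) incurs a hinge loss that grows linearly, while the bounded loss saturates at `lam`, which is the intuition behind the robustness claim.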