A Generalized Meta-loss function for regression and classification using privileged information

Amina Asif, Muhammad Dawood, Fayyaz ul Amir Afsar Minhas
DOI: https://doi.org/10.48550/arXiv.1811.06885
2019-03-25
Abstract: Learning using privileged information (LUPI) is a powerful heterogeneous feature space machine learning framework that allows a machine learning model to learn from highly informative or privileged features which are available during training only to generate test predictions using input space features which are available both during training and testing. LUPI can significantly improve prediction performance in a variety of machine learning problems. However, existing large margin and neural network implementations of learning using privileged information are mostly designed for classification tasks. In this work, we have proposed a simple yet effective formulation that allows us to perform regression using privileged information through a custom loss function. Apart from regression, our formulation allows general application of LUPI to classification and other related problems as well. We have verified the correctness, applicability and effectiveness of our method on regression and classification problems over different synthetic and real-world problems. To test the usefulness of the proposed model in real-world problems, we have evaluated our method on the problem of protein binding affinity prediction. The proposed LUPI regression-based model has been shown to outperform the current state-of-the-art predictor.
Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is that existing large-margin and neural-network implementations of Learning Using Privileged Information (LUPI) are mostly limited to classification and cannot be directly applied to other machine-learning problems such as regression and ranking. The authors propose a general meta-loss function that extends LUPI to regression and related problems, and they verify its effectiveness and applicability on a variety of synthetic and real-world data sets.

### Specific Problem Description

1. **Limitations of Existing Methods**:
   - Existing LUPI implementations mainly focus on classification tasks.
   - These implementations use label softening as a regularization method to control the information flow between the privileged space and the input space, which limits the application of LUPI to non-classification tasks (such as regression).
2. **Goals of the New Method**:
   - Propose a simple but effective meta-loss function so that LUPI can be applied to regression tasks.
   - Expand the application range of LUPI so that it can handle multiple types of machine-learning problems such as classification, regression, and ranking.
3. **Practical Application Scenarios**:
   - The paper verifies the effectiveness of the proposed LUPI regression model on the practical problem of protein binding affinity prediction and shows that it outperforms the current state-of-the-art predictor.

### Mathematical Formula

To achieve the above goals, the authors propose a new meta-loss function:

\[
f_s = \arg\min_{f \in \mathcal{F}_s} \frac{1}{N} \sum_{i=1}^{N} \left[ \left(1 - e^{-T\, l(f_t(\mathbf{x}_i^*),\, y_i)}\right) l(f(\mathbf{x}_i), y_i) + e^{-T\, l(f_t(\mathbf{x}_i^*),\, y_i)}\, l(f(\mathbf{x}_i), f_t(\mathbf{x}_i^*)) \right]
\]

where:

- \( f_t(\mathbf{x}_i^*) \) is the prediction of the teacher model in the privileged space.
- \( f(\mathbf{x}_i) \) is the prediction of the student model in the input space.
- \( l(\cdot,\cdot) \) is the loss function.
- \( T \) is the temperature hyperparameter, which controls the degree of knowledge transfer from the teacher to the student.
- \( e^{-T l(f_t(\mathbf{x}_i^*), y_i)} \) is a weighting factor that decays exponentially as the loss between the teacher output and the actual target grows. It equals 1 when the teacher loss is zero and approaches 0 when the teacher loss is large.

The weighting factor makes the student imitate the teacher more closely on examples where the teacher's prediction is reliable (low teacher loss). On examples where the teacher is unreliable, the student relies more on the true target \( y_i \).

### Experimental Verification

The authors verify the effectiveness of the proposed method on the following types of data sets:

1. **Synthetic Data Sets**: Four different experimental configurations that demonstrate the effectiveness of the method in classification and regression tasks.
2. **MNIST Handwritten Digit Image Classification**: Further verifies the performance of the method on a real classification task.
3. **Protein Binding Affinity Prediction**: Verifies the performance of the method on a real regression task and shows its superiority over existing methods.

In summary, this paper extends LUPI to regression tasks by proposing a new meta-loss function and verifies its effectiveness and wide applicability through multiple experiments.
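
The meta-loss formula above can be illustrated with a small sketch. This is not the authors' implementation: it assumes squared error as the base loss \( l \), takes precomputed teacher and student predictions as plain lists, and the names `squared_error`, `teacher_weight`, and `meta_loss` are hypothetical. It only evaluates the objective for given predictions; the minimization over the student model is left out.

```python
import math

def squared_error(pred, target):
    """Base loss l(., .). Squared error is an assumption made for this sketch."""
    return (pred - target) ** 2

def teacher_weight(teacher_pred, target, T=1.0, loss=squared_error):
    """Weighting factor exp(-T * l(f_t(x*), y)), which lies in (0, 1]."""
    return math.exp(-T * loss(teacher_pred, target))

def meta_loss(student_preds, teacher_preds, targets, T=1.0, loss=squared_error):
    """Average meta-loss over N examples, following the formula term by term."""
    total = 0.0
    for s, t, y in zip(student_preds, teacher_preds, targets):
        w = teacher_weight(t, y, T, loss)
        # (1 - w) * l(student, target) + w * l(student, teacher)
        total += (1.0 - w) * loss(s, y) + w * loss(s, t)
    return total / len(targets)

# Illustrative values only (not from the paper):
targets = [1.0, 2.0]
teacher = [1.0, 5.0]   # exact on the first example, far off on the second
student = [0.5, 2.5]

print(teacher_weight(teacher[0], targets[0]))  # teacher is exact, so the weight is 1.0
print(teacher_weight(teacher[1], targets[1]))  # large teacher loss, weight close to 0
print(meta_loss(student, teacher, targets, T=1.0))
```

In this sketch the first example contributes only the imitation term, because the teacher matches the target exactly. The second contributes almost entirely the student-versus-target term, because the teacher's error drives its weight toward zero. Setting `T=0.0` gives every example a weight of 1, so the student is trained purely to imitate the teacher.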