Ranking with Abstention

Anqi Mao, Mehryar Mohri, Yutao Zhong
2023-07-05
Abstract: We introduce a novel framework of ranking with abstention, where the learner can abstain from making a prediction at some limited cost $c$. We present an extensive theoretical analysis of this framework, including a series of $H$-consistency bounds for both the family of linear functions and that of neural networks with one hidden layer. These theoretical guarantees are the state-of-the-art consistency guarantees in the literature, which are upper bounds on the target loss estimation error of a predictor in a hypothesis set $H$, expressed in terms of the surrogate loss estimation error of that predictor. We further argue that our proposed abstention methods are important when using common equicontinuous hypothesis sets in practice. We report the results of experiments illustrating the effectiveness of ranking with abstention.
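To make the shape of such a guarantee concrete, an $H$-consistency bound in its simplest form can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: $L$ is the target loss, $\Phi$ the surrogate loss, $R_L(h)$ the expected loss of $h$ under $L$, $R_L^*(H) = \inf_{h \in H} R_L(h)$ the best expected loss within $H$, and $\Gamma$ a non-decreasing function.

$$
R_L(h) - R_L^*(H) \;\le\; \Gamma\big(R_\Phi(h) - R_\Phi^*(H)\big) \quad \text{for all } h \in H.
$$

Read this way, driving the surrogate estimation error on the right toward zero forces the target estimation error on the left toward zero as well. Bounds of this kind can carry additional terms that depend on $H$ and on the two losses; the exact statements are those given in the paper.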
Machine Learning
What problem does this paper attempt to address?
This paper introduces a new framework for ranking tasks, ranking with abstention, in which the learner may decline to make a prediction in certain cases at a cost \(c\). Its main objectives are:

1. **Theoretical Analysis**: An extensive theoretical analysis of the framework, including several H-consistency bounds for families of linear functions and of neural networks with one hidden layer. The paper describes these as the state-of-the-art consistency guarantees in the literature.
2. **Importance of the Method**: Arguing that the proposed abstention method matters when common equicontinuous hypothesis sets are used in practice. Directly optimizing the pairwise misranking loss, with or without abstention, is often infeasible, so learning relies on surrogate loss functions.
3. **Experimental Validation**: Demonstrating the effectiveness of ranking with abstention through experiments. The results show that in some cases appropriate abstention can significantly reduce the target loss, especially for sample pairs that are close to each other.

### Main Contributions

1. **Theoretical Contributions**:
   - Proposed the ranking with abstention framework.
   - Provided H-consistency bounds for families of linear functions and of neural networks with one hidden layer.
   - Analyzed the role of the minimizability gap in these bounds.
2. **Practical Contributions**:
   - Experimentally validated the effectiveness of ranking with abstention.
   - Pointed out that direct optimization of the pairwise misranking loss is ineffective for sample pairs that are close to each other, whereas abstention can significantly improve performance.

### Background and Motivation

In many application scenarios, ranking is more appropriate than classification because the order of items is crucial. For example, in movie recommendation systems, users are more inclined to watch higher-ranked movies. However, the paper points out that for most hypothesis sets, directly optimizing the pairwise misranking loss is infeasible, so learning relies on surrogate loss functions instead. The ranking with abstention framework is introduced to improve model performance and robustness in this setting.

### Experimental Results

Experiments were conducted on the CIFAR-10 dataset using the ResNet-34 model. The results show:

- When the abstention threshold \(\gamma\) is small, abstention does not actually occur, and the abstention loss coincides with the standard pairwise misranking loss.
- As \(\gamma\) increases, more samples are abstained on. When the cost \(c\) is small, an appropriate choice of \(\gamma\) can significantly reduce the target loss.
- When the cost \(c\) is large, the abstention loss usually increases with \(\gamma\) because the number of rejected samples increases.

### Conclusion

The paper proposes a new ranking with abstention framework and validates its effectiveness through theoretical analysis and experiments. The framework performs particularly well on sample pairs that are close to each other, providing a new solution for ranking tasks.
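The interplay between the threshold \(\gamma\) and the cost \(c\) described in the experimental results can be illustrated with a minimal sketch. It assumes, as this summary suggests but does not state precisely, that the learner abstains on a pair whenever the Euclidean distance between its two points is at most \(\gamma\), pays \(c\) in that case, and otherwise pays 1 for a misranked pair. The function names, the toy scorer, and the toy pairs are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# A labeled pair (x, x_prime, y): y = +1 if x_prime should rank above x, else -1.
Pair = Tuple[Vector, Vector, int]


def pairwise_abstention_loss(
    score: Callable[[Vector], float],
    x: Vector,
    x_prime: Vector,
    y: int,
    gamma: float,
    c: float,
) -> float:
    """Loss on one pair under the assumed distance-based abstention rule."""
    # Assumption: abstain when the two points are within distance gamma.
    if math.dist(x, x_prime) <= gamma:
        return c
    # Otherwise pay 1 if the predicted order disagrees with the label (ties count as errors).
    margin = y * (score(x_prime) - score(x))
    return 1.0 if margin <= 0 else 0.0


def mean_abstention_loss(
    score: Callable[[Vector], float], pairs: List[Pair], gamma: float, c: float
) -> float:
    """Average abstention loss over a list of labeled pairs."""
    losses = [pairwise_abstention_loss(score, x, xp, y, gamma, c) for x, xp, y in pairs]
    return sum(losses) / len(losses)


def linear_score(x: Vector) -> float:
    """Hypothetical scorer: the sum of the coordinates."""
    return sum(x)


# Toy data: the second pair is very close together and is misranked by linear_score.
example_pairs: List[Pair] = [
    ((0.0, 0.0), (1.0, 1.0), +1),
    ((0.0, 0.0), (0.1, 0.0), -1),
    ((2.0, 0.0), (0.0, 0.0), -1),
]

loss_without_abstention = mean_abstention_loss(linear_score, example_pairs, gamma=0.0, c=0.2)
loss_with_abstention = mean_abstention_loss(linear_score, example_pairs, gamma=0.5, c=0.2)
print(loss_without_abstention, loss_with_abstention)
```

On this toy data, \(\gamma = 0\) triggers no abstention, so the value is the plain misranking rate. With \(\gamma = 0.5\) and a small cost, the close, misranked pair is abstained on and the average loss drops. With a large cost, abstaining on that pair would no longer help.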