Partial-Label Regression

Xin Cheng, Deng-Bao Wang, Lei Feng, Min-Ling Zhang, Bo An
2023-06-15
Abstract: Partial-label learning is a popular weakly supervised learning setting that allows each training example to be annotated with a set of candidate labels. Previous studies on partial-label learning only focused on the classification setting where candidate labels are all discrete, which cannot handle continuous labels with real values. In this paper, we provide the first attempt to investigate partial-label regression, where each training example is annotated with a set of real-valued candidate labels. To solve this problem, we first propose a simple baseline method that takes the average loss incurred by candidate labels as the predictive loss. The drawback of this method lies in that the loss incurred by the true label may be overwhelmed by other false labels. To overcome this drawback, we propose an identification method that takes the least loss incurred by candidate labels as the predictive loss. We further improve it by proposing a progressive identification method to differentiate candidate labels using progressively updated weights for incurred losses. We prove that the latter two methods are model-consistent and provide convergence analyses. Our proposed methods are theoretically grounded and can be compatible with any models, optimizers, and losses. Experiments validate the effectiveness of our proposed methods.
Machine Learning
### What problem does this paper attempt to solve?

This paper explores the problem of partial-label regression (PLR). In traditional partial-label learning (PLL), each training sample is annotated with a set of candidate labels, only one of which is the true label. Existing PLL methods handle only discrete labels and cannot deal with true labels that take continuous values. This paper is therefore the first attempt to address partial-label regression, where the candidate labels are real-valued.

#### Specific Problems:

1. **Limitations of existing PLL methods**: Existing PLL methods can only handle discrete labels and cannot deal with continuous labels.
2. **Proposing new methods**: Three methods are proposed for the PLR problem: the averaging method, the identification method, and the progressive identification method. The latter two are proven to be model-consistent, and convergence analyses are provided for them.
3. **Experimental validation**: The proposed methods are evaluated on seven benchmark datasets and compared with fully supervised methods and other baseline methods.

By addressing the PLR problem, this research fills a gap in weakly supervised learning and provides new solutions for handling data with continuous-valued candidate labels in real-world scenarios.
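The three methods differ only in how the per-candidate losses for one training example are combined into a single predictive loss. The sketch below illustrates that difference in plain Python for a single scalar prediction. It is not the authors' implementation: the function names, the squared-error base loss, and in particular the weight-update rule (a softmax over negative losses with a hypothetical `temperature` parameter) are assumptions made for illustration, since the summary above does not state the exact update the paper uses.

```python
import math


def squared_error(pred, label):
    """Base regression loss; any other loss could be substituted."""
    return (pred - label) ** 2


def average_loss(pred, candidates, loss=squared_error):
    """Averaging method: mean of the losses over all candidate labels."""
    losses = [loss(pred, y) for y in candidates]
    return sum(losses) / len(losses)


def identification_loss(pred, candidates, loss=squared_error):
    """Identification method: the smallest loss among the candidate labels."""
    return min(loss(pred, y) for y in candidates)


def update_weights(pred, candidates, loss=squared_error, temperature=1.0):
    """Assumed update rule (not taken from the paper): softmax over negative
    losses, so candidates that currently fit better receive larger weights."""
    losses = [loss(pred, y) for y in candidates]
    smallest = min(losses)  # subtract the minimum for numerical stability
    scores = [math.exp(-(l - smallest) / temperature) for l in losses]
    total = sum(scores)
    return [s / total for s in scores]


def progressive_loss(pred, candidates, weights, loss=squared_error):
    """Progressive identification: weighted sum of candidate losses, with the
    weights treated as constants that are refreshed between training steps."""
    return sum(w * loss(pred, y) for w, y in zip(weights, candidates))


# One example with three real-valued candidate labels.
pred = 2.0
candidates = [1.0, 2.5, 6.0]

avg = average_loss(pred, candidates)
ident = identification_loss(pred, candidates)
weights = update_weights(pred, candidates)
prog = progressive_loss(pred, candidates, weights)
```

In this example the distant candidate 6.0 dominates the averaged loss, which is the drawback the abstract describes, while the identification loss depends only on the closest candidate. Under the assumed update rule the progressive loss falls between the two, and lowering `temperature` moves it toward the identification loss.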