Abstract: We study a variant of online multiclass classification where the learner predicts a single label but receives a *set of labels* as feedback. In this model, the learner is penalized for not outputting a label contained in the revealed set. We show that unlike online multiclass learning with single-label feedback, deterministic and randomized online learnability are *not equivalent* even in the realizable setting with set-valued feedback. Accordingly, we give two new combinatorial dimensions, named the Set Littlestone and Measure Shattering dimension, that tightly characterize deterministic and randomized online learnability respectively in the realizable setting. In addition, we show that the Measure Shattering dimension characterizes online learnability in the agnostic setting and tightly quantifies the minimax regret. Finally, we use our results to establish bounds on the minimax regret for three practical learning settings: online multilabel ranking, online multilabel classification, and real-valued prediction with interval-valued response.
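
The feedback model in the abstract can be made concrete with a short sketch. The Python below is illustrative only: `set_loss` and `run_consistent_learner` are hypothetical names, and the learner shown is a naive consistent-hypothesis rule for a finite hypothesis class in the realizable setting, not the algorithm developed in the paper.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple

Label = Hashable
Hypothesis = Callable[[Hashable], Label]


def set_loss(prediction: Label, revealed: Set[Label]) -> int:
    """0-1 loss under set-valued feedback: 1 iff the prediction is outside the revealed set."""
    return 0 if prediction in revealed else 1


def run_consistent_learner(
    hypotheses: Iterable[Hypothesis],
    stream: Iterable[Tuple[Hashable, Set[Label]]],
) -> int:
    """Count mistakes of a naive deterministic learner (an assumption, not the paper's method).

    Each round the learner predicts with the first hypothesis that is still
    consistent with all past feedback, then discards every hypothesis whose
    output on the current instance falls outside the revealed set.
    """
    version_space: List[Hypothesis] = list(hypotheses)
    mistakes = 0
    for x, revealed in stream:
        if not version_space:
            raise ValueError("stream is not realizable by the given hypotheses")
        prediction = version_space[0](x)
        mistakes += set_loss(prediction, revealed)
        version_space = [h for h in version_space if h(x) in revealed]
    return mistakes


# Toy example: four constant hypotheses h_k(x) = k; the stream is realizable by h_3.
hypotheses = [lambda x, k=k: k for k in range(4)]
stream = [(0, {2, 3}), (1, {3}), (2, {1, 3})]
mistakes = run_consistent_learner(hypotheses, stream)
print(mistakes)  # 2
```

In the realizable setting, every mistake removes at least the hypothesis that produced the prediction, so this rule makes at most one fewer mistake than the number of hypotheses. That bound is a property of the sketch, not a result quoted from the paper.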

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper studies a variant of online multiclass classification in which the learner predicts a single label in each round but receives a set of labels as feedback. The learner incurs a loss only when the predicted label is not in the revealed set.

#### Main Contributions

1. **Separation of Deterministic and Randomized Learnability**:
   - With set-valued feedback, deterministic and randomized online learnability are not equivalent, even in the realizable setting.
   - This contrasts with online multiclass learning with single-label feedback, where deterministic and randomized learnability are equivalent in the realizable setting.
   - If the Helly number of the set system is finite, deterministic and randomized realizable learnability are equivalent.
2. **New Combinatorial Dimensions**:
   - Two new combinatorial dimensions are proposed, the Set Littlestone dimension and the Measure Shattering dimension, which characterize deterministic and randomized realizable learnability, respectively.
   - These dimensions are defined through infinite-width complete trees, which distinguishes them from existing combinatorial dimensions.
3. **Characterization in the Agnostic Setting**:
   - The Measure Shattering dimension continues to characterize online learnability in the agnostic setting.
   - Randomized realizable learnability and agnostic learnability are equivalent.
4. **Practical Applications**:
   - Using these results, bounds on the minimax regret are established for three practical learning scenarios: online multilabel ranking, online multilabel classification, and real-valued prediction with interval-valued feedback.

### Theoretical Contributions

- **Relationship of Combinatorial Dimensions**:
  - The Set Littlestone dimension and the Measure Shattering dimension are defined, and their relationship is demonstrated.
  - When the set system has a finite Helly number, these dimensions are equivalent.
- **Separation of Deterministic and Randomized Learning**:
  - It is proven that in certain cases a randomized learning algorithm can learn successfully while no deterministic algorithm can.
- **Design of Learning Algorithms**:
  - For deterministic learning, an algorithm is proposed that extends Littlestone's classical Standard Optimal Algorithm.
  - For randomized learning, a multi-scale online learning algorithm is designed, incorporating algorithm chaining techniques.

### Practical Applications

- **Online Multilabel Ranking**: The learner ranks the labels by their relevance to each instance but only receives feedback indicating which labels are relevant.
- **Online Multilabel Classification**: The learner predicts multiple labels and is penalized only when the number of classification errors reaches a certain threshold.
- **Real-Valued Prediction with Interval Feedback**: The learner predicts a real number, but the feedback is an interval.

In summary, this paper addresses theoretical and practical questions in online multiclass classification with set-valued feedback by introducing new combinatorial dimensions and learning algorithms.
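
To give some intuition for why deterministic and randomized learners can behave differently under set-valued feedback, here is a toy construction. It is an assumption made for illustration and not necessarily the example used in the paper: the label space is `{0, ..., num_labels - 1}`, the hypotheses are the constant functions, and each revealed set is the whole label space minus one label. The names `force_deterministic_mistakes` and `uniform_expected_mistakes` are hypothetical.

```python
from typing import Callable, List, Set, Tuple


def force_deterministic_mistakes(
    learner: Callable[[List[Set[int]]], int],
    num_labels: int,
    rounds: int,
) -> Tuple[int, Set[int]]:
    """Adversary against a deterministic learner (toy construction, an assumption).

    The learner is a function of the feedback history, so the adversary can
    compute its prediction in advance and reveal every label except that one.
    Returns the number of mistakes and the set of labels never excluded; the
    sequence is realizable by a constant hypothesis iff that set is non-empty.
    """
    history: List[Set[int]] = []
    surviving = set(range(num_labels))
    mistakes = 0
    for _ in range(rounds):
        prediction = learner(history)
        revealed = set(range(num_labels)) - {prediction}
        mistakes += int(prediction not in revealed)  # always 1 by construction
        surviving &= revealed
        history.append(revealed)
    return mistakes, surviving


def uniform_expected_mistakes(num_labels: int, rounds: int) -> float:
    """Expected mistakes of a learner that predicts uniformly at random each round.

    Whatever single label the adversary excludes before seeing the draw, the
    prediction hits it with probability 1 / num_labels.
    """
    return rounds / num_labels


# A deterministic learner that predicts the round index errs in every round,
# yet labels 10..99 are never excluded, so the sequence stays realizable.
det_mistakes, surviving = force_deterministic_mistakes(
    learner=lambda history: len(history), num_labels=100, rounds=10
)
rand_mistakes = uniform_expected_mistakes(num_labels=100, rounds=10)
```

In this toy setup the deterministic learner is wrong in all 10 rounds, while uniform random prediction errs 0.1 times in expectation. The gap widens as the label space grows relative to the number of rounds.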

Online Learning with Set-Valued Feedback

Multiclass Online Learnability under Bandit Feedback

Learning-augmented Algorithms for Online Subset Sum

Online Learning: Stochastic and Constrained Adversaries

Combinatorial Bandits with Relative Feedback

Online Ranking with Top-1 Feedback

A Combinatorial Characterization of Supervised Online Learnability

On the Learnability of Multilabel Ranking

Online Learning with Feedback Graphs: Beyond Bandits

Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs

Online Classification with Predictions

Online Learning to Rank with Top-k Feedback

Apple Tasting: Combinatorial Dimensions and Minimax Rates

Optimal Learners for Realizable Regression: PAC Learning and Online Learning

Online Learning from Strategic Human Feedback in LLM Fine-Tuning

Equal Opportunity in Online Classification with Partial Feedback

Online Learning with Composite Loss Functions

Bounds on the price of feedback for mistake-bounded online learning

On Adaptivity in Information-constrained Online Learning

The Interplay Between Stability and Regret in Online Learning

LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization