Abstract: We start from a simple asymptotic result for the problem of on-line regression with the quadratic loss function: the class of continuous limited-memory prediction strategies admits a "leading prediction strategy", which not only asymptotically performs at least as well as any continuous limited-memory strategy but also satisfies the property that the excess loss of any continuous limited-memory strategy is determined by how closely it imitates the leading strategy. More specifically, for any class of prediction strategies constituting a reproducing kernel Hilbert space we construct a leading strategy, in the sense that the loss of any prediction strategy whose norm is not too large is determined by how closely it imitates the leading strategy. This result is extended to the loss functions given by Bregman divergences and by strictly proper scoring rules.

What problem does this paper attempt to address?

The paper addresses the problem of competitive strategies in online prediction. Specifically:

1. **Asymptotic Results in Online Regression**: The paper starts from a simple asymptotic result for online regression with the squared-loss function. The class of continuous limited-memory prediction strategies admits a "leading prediction strategy", which asymptotically performs at least as well as any continuous limited-memory strategy. In addition, the excess loss of any other strategy in the class is determined by how closely it imitates the leading strategy.
2. **Leading Strategy in Reproducing Kernel Hilbert Space (RKHS)**: For any class of prediction strategies that forms a reproducing kernel Hilbert space, the paper constructs a leading strategy such that the loss of any prediction strategy whose norm is not too large is determined by how closely it imitates the leading strategy.
3. **Extension to a Wider Range of Loss Functions**: This result is generalized to loss functions given by Bregman divergences and by strictly proper scoring rules.
4. **Competitiveness in Online Prediction**: The paper emphasizes that online prediction usually avoids making any stochastic assumptions about how the observations are generated, but in some cases it also considers randomly generated observations.
5. **Application of Defensive Prediction**: The paper uses the defensive prediction method to construct master strategies, which automatically satisfy the stronger properties required of a leading strategy.
6. **Relationships among the Successes of Different Prediction Strategies**: The paper also explores how the successes of different prediction strategies are related, especially in the form of Jeffreys's law, which indicates that successful prediction strategies tend to converge.
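The excess-loss claim in the first point can be illustrated numerically for the squared loss. The sketch below is not the paper's construction (which relies on defensive prediction and makes no stochastic assumptions). It assumes a simple stochastic setting in which each observation has a known conditional mean `mu`, and that mean stands in for the leading strategy. The function name `simulate_excess_loss`, the sinusoidal mean, the noise range, and the constant competitor `phi` are all hypothetical choices made for illustration. The check rests on the algebraic identity (y - phi)^2 = (y - mu)^2 + (mu - phi)^2 + 2(y - mu)(mu - phi), whose cross term averages out when the noise has zero mean.

```python
import math
import random


def simulate_excess_loss(n_steps=10_000, seed=0):
    """Compare cumulative squared losses of a stand-in leading strategy and a competitor.

    Assumption: observations are y = mu + zero-mean bounded noise, so that mu
    (the conditional mean) plays the role of the leading strategy.
    """
    rng = random.Random(seed)
    loss_mu = 0.0    # cumulative loss of the stand-in leading strategy
    loss_phi = 0.0   # cumulative loss of the competitor
    imitation = 0.0  # cumulative squared distance between the two strategies
    for n in range(1, n_steps + 1):
        mu = 0.5 + 0.4 * math.sin(n / 50.0)  # hypothetical conditional mean
        phi = 0.5                            # hypothetical constant competitor
        y = mu + rng.uniform(-0.1, 0.1)      # zero-mean bounded noise
        loss_mu += (y - mu) ** 2
        loss_phi += (y - phi) ** 2
        imitation += (mu - phi) ** 2
    # residual equals -2 * sum((y - mu) * (mu - phi)), a zero-mean sum
    residual = loss_mu + imitation - loss_phi
    return loss_mu, loss_phi, imitation, residual


if __name__ == "__main__":
    n = 10_000
    loss_mu, loss_phi, imitation, residual = simulate_excess_loss(n)
    print(f"loss of mu:        {loss_mu:.2f}")
    print(f"loss of phi:       {loss_phi:.2f}")
    print(f"imitation term:    {imitation:.2f}")
    print(f"|residual|:        {abs(residual):.2f} (compare sqrt(N) = {math.sqrt(n):.0f})")
```

Under these assumptions the competitor's excess loss is close to the imitation term, and the residual stays small relative to the number of steps, which is the qualitative behaviour the summary describes.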
In summary, the core problem of this paper is to construct and analyze leading strategies: strategies that asymptotically perform at least as well as any strategy in the class considered, and relative to which the excess loss of every other strategy in the class can be measured. The paper then extends these results to a wider range of loss functions, without making strong assumptions about the data-generating mechanism.

### Key Formulas

- Squared-loss function:
\[ \lambda(y, \mu)=(y - \mu)^2 \]
- Bregman divergence:
\[ d_{\Psi, \Psi'}(y, z):=\Psi(y)-\Psi(z)-\Psi'(z)(y - z) \]
- Relative entropy (Kullback-Leibler divergence):
\[ D(y \| z):=y \ln\frac{y}{z}+(1 - y)\ln\frac{1 - y}{1 - z} \]
- Key inequality in Jeffreys's law:
\[ \left|\sum_{n = 1}^N\lambda(y_n, \mu_n)+\sum_{n = 1}^N d_\lambda(\mu_n, \phi_n)-\sum_{n = 1}^N\lambda(y_n, \phi_n)\right| \leq\sqrt{c_F^2 + 1}\left(\left\|\text{Exp}_\lambda(F)\right\|_F+\left\|\text{Exp}_\lambda\right\|_{C(P)}\right)\sqrt{N} \]

These formulas are used in the paper to define the loss functions and to bound the performance of prediction strategies.
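The formulas for the Bregman divergence, the squared loss, and the relative entropy are connected: the last two are standard special cases of the first. The following minimal sketch checks this numerically. The helper names (`bregman`, `square`, `neg_entropy`, `kl`) are illustrative and do not come from the paper; the choices of generator, Psi(y) = y^2 for the squared loss and the negative binary entropy for the relative entropy, are the usual textbook ones and are assumed here rather than quoted from the source.

```python
import math


def bregman(psi, dpsi, y, z):
    """Bregman divergence: psi(y) - psi(z) - psi'(z) * (y - z)."""
    return psi(y) - psi(z) - dpsi(z) * (y - z)


def square(y):
    """Generator Psi(y) = y^2, which yields the squared loss."""
    return y * y


def d_square(z):
    """Derivative of y^2."""
    return 2.0 * z


def neg_entropy(y):
    """Generator Psi(y) = y ln y + (1 - y) ln(1 - y), for 0 < y < 1."""
    return y * math.log(y) + (1.0 - y) * math.log(1.0 - y)


def d_neg_entropy(z):
    """Derivative of the negative binary entropy: ln(z / (1 - z))."""
    return math.log(z) - math.log(1.0 - z)


def kl(y, z):
    """Relative entropy D(y || z) between Bernoulli parameters in (0, 1)."""
    return y * math.log(y / z) + (1.0 - y) * math.log((1.0 - y) / (1.0 - z))


if __name__ == "__main__":
    y, z = 0.3, 0.7
    # With Psi(y) = y^2 the divergence reduces to (y - z)^2.
    print(bregman(square, d_square, y, z), (y - z) ** 2)
    # With the negative entropy it reduces to the relative entropy.
    print(bregman(neg_entropy, d_neg_entropy, y, z), kl(y, z))
```

Both printed pairs should agree up to floating-point rounding. The entropy-based functions are only defined for arguments strictly between 0 and 1.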

Leading strategies in competitive on-line prediction

Adversarial Prediction Games for Multivariate Losses

Electronic structure study by means of X-ray spectroscopy and theoretical calculations of the "ferric star" single molecule magnet

Competing with Markov prediction strategies

Predictions as statements and decisions

Continuous Prediction with Experts' Advice

Online Learning: Sufficient Statistics and the Burkholder Method

Online Prediction With History-Dependent Experts: The General Case

Online Optimization with Predictions and Non-convex Losses

No-Regret Online Prediction with Strategic Experts

Sequential prediction of individual sequences under general loss functions

Online Learning with Primary and Secondary Losses

Online Convex Optimization with Memory and Limited Predictions

Online Classification with Predictions

Sequential optimizing strategy in multi-dimensional bounded forecasting games

Online Optimization With Predictions and Switching Costs: Fast Algorithms and the Fundamental Limit

Competitive Machine Learning: Best Theoretical Prediction vs Optimization

Learning Predictions for Algorithms with Predictions

Metric entropy in competitive on-line prediction

Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis

High-Dimensional Prediction for Sequential Decision Making