Abstract: In many structured prediction problems, complex relationships between variables are compactly defined using graphical structures. The most prevalent graphical prediction methods, probabilistic graphical models and large margin methods, have their own distinct strengths but also possess significant drawbacks. Conditional random fields (CRFs) are Fisher consistent, but they do not permit integration of customized loss metrics into their learning process. Large-margin models, such as structured support vector machines (SSVMs), have the flexibility to incorporate customized loss metrics, but lack Fisher consistency guarantees. We present adversarial graphical models (AGM), a distributionally robust approach for constructing a predictor that performs robustly for a class of data distributions defined using a graphical structure. Our approach enjoys both the flexibility of incorporating customized loss metrics into its design as well as the statistical guarantee of Fisher consistency. We present exact learning and prediction algorithms for AGM with time complexity similar to existing graphical models and show the practical benefits of our approach with experiments.

What problem does this paper attempt to address?

This paper addresses the trade-off between the two main existing approaches to structured prediction: probabilistic graphical models, such as conditional random fields (CRFs), and large-margin methods, such as structured support vector machines (SSVMs). Specifically:

- **Conditional random fields (CRFs)**: These models are Fisher consistent. Under ideal learning conditions (the true data distribution and a fully expressive feature representation), they produce predictions that minimize the expected loss. However, CRFs cannot integrate customized evaluation loss metrics into the training process.
- **Structured support vector machines (SSVMs)**: These models can integrate customized evaluation loss metrics directly into the training objective, but they lack a Fisher consistency guarantee in the multi-class setting.

To overcome the limitations of both approaches, the paper proposes **adversarial graphical models (AGM)**, a distributionally robust method that constructs a predictor that is robust over a class of data distributions defined by a graphical structure. AGM has the flexibility to integrate customized loss metrics and also provides a Fisher consistency guarantee.

### Main contributions of the paper

1. **Proposing adversarial graphical models (AGM)**:
   - AGM seeks a predictor through a robust adversarial formulation: it minimizes a loss metric in the worst case, given a statistical summary of the empirical distribution.
   - The approach replaces the empirical training data with an adversary that is free to choose the evaluation distribution from the set of distributions matching the statistical summary of the empirical training data.
2. **Theoretical guarantees**:
   - The AGM framework accepts multiple loss metrics and provides a Fisher consistency guarantee for the selected loss metric.
   - Through the robust adversarial formulation, AGM aligns the training objective more closely with the evaluation loss metric while maintaining convexity.
3. **Efficient algorithms**:
   - The paper proposes exact learning and prediction algorithms for graphical structures with low treewidth, with time complexity similar to that of existing graphical models.
   - Experimental results show that AGM outperforms previous models in structured prediction tasks.

### Mathematical formulas

- **Adversarial prediction method**:

  \[ \min_{\hat{P}(\hat{y}|x)} \max_{\check{P}(\check{y}|x)} \mathbb{E}_{X \sim \tilde{P}; \hat{Y}|X \sim \hat{P}; \check{Y}|X \sim \check{P}}[\text{loss}(\hat{Y}, \check{Y})] \]

  where the adversary's distribution must satisfy the moment-matching constraint:

  \[ \mathbb{E}_{X \sim \tilde{P}; \check{Y}|X \sim \check{P}}[\Phi(X, \check{Y})] = \tilde{\Phi} \]

- **Bi-optimization problem**:

  \[ \min_{\theta_e, \theta_v} \mathbb{E}_{X, Y \sim \tilde{P}} \max_{\check{P}(\check{y}|x)} \min_{\hat{P}(\hat{y}|x)} \left[ \sum_i \sum_{\hat{y}_i, \check{y}_i} \hat{P}(\hat{y}_i|x) \check{P}(\check{y}_i|x) \text{loss}(\hat{y}_i, \check{y}_i) + \cdots \right] \]

### Experimental verification

The paper carried out experimental verification on two different tasks:

1. **Facial emotion intensity prediction**:
   - Given a sequence of facial images, the task is to predict the emotion intensity of each image.
   - The emotion intensity labels are divided into three ordered categories: neutral < increasing < peak.
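The two ideas that recur in the summary above, loss-aware prediction and the predictor-versus-adversary game, can be illustrated on a single variable with the three ordered labels. The following is a minimal sketch, not the paper's algorithm: it deliberately drops the moment-matching constraint on the adversary and the graphical structure, and all function names and the example probabilities are assumptions introduced for illustration.

```python
from typing import Callable, Sequence

# Ordered labels from the emotion-intensity task; indices 0, 1, 2 encode the order.
LABELS = ("neutral", "increasing", "peak")

Loss = Callable[[int, int], float]


def zero_one_loss(y_hat: int, y_check: int) -> float:
    """Loss of 1 for any mistake, 0 otherwise."""
    return float(y_hat != y_check)


def absolute_loss(y_hat: int, y_check: int) -> float:
    """Ordinal loss: distance between predicted and evaluated label."""
    return float(abs(y_hat - y_check))


def expected_loss(p_hat: Sequence[float], p_check: Sequence[float], loss: Loss) -> float:
    """E[loss(Y_hat, Y_check)] for independent Y_hat ~ p_hat and Y_check ~ p_check."""
    return sum(
        p_hat[i] * p_check[j] * loss(i, j)
        for i in range(len(p_hat))
        for j in range(len(p_check))
    )


def worst_case_loss(p_hat: Sequence[float], loss: Loss) -> float:
    """Inner maximization for an UNCONSTRAINED adversary (assumption: no
    moment-matching constraint). The objective is linear in the adversary's
    distribution, so checking its pure strategies is sufficient."""
    n = len(p_hat)
    return max(sum(p_hat[i] * loss(i, j) for i in range(n)) for j in range(n))


def bayes_optimal_label(p_true: Sequence[float], loss: Loss) -> int:
    """Label that minimizes expected loss under a known label distribution."""
    n = len(p_true)
    risks = [sum(p_true[j] * loss(i, j) for j in range(n)) for i in range(n)]
    return min(range(n), key=risks.__getitem__)


if __name__ == "__main__":
    p_true = [0.4, 0.25, 0.35]  # hypothetical label distribution for one image
    print("zero-one optimal:", LABELS[bayes_optimal_label(p_true, zero_one_loss)])
    print("absolute optimal:", LABELS[bayes_optimal_label(p_true, absolute_loss)])
    print("deterministic worst case:", worst_case_loss([1.0, 0.0, 0.0], zero_one_loss))
    print("uniform worst case:", worst_case_loss([1 / 3, 1 / 3, 1 / 3], zero_one_loss))
```

With the hypothetical distribution above, the loss-minimizing label is "neutral" under zero-one loss but "increasing" under absolute loss, which is why the choice of loss metric matters for ordinal labels. Against the unconstrained adversary, a deterministic predictor has a worst-case zero-one loss of 1, while the uniform predictor lowers it to 2/3. In AGM the adversary is additionally restricted by the moment-matching constraint, which this sketch does not model.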

Distributionally Robust Graphical Models
