Abstract: In many real-world prediction tasks, class labels contain information about the relative order between labels that is not captured by commonly used loss functions such as multicategory cross-entropy. Recently, the preference for unimodal distributions in the output space has been incorporated into models and loss functions to account for such ordering information. However, current approaches rely on heuristics that lack a theoretical foundation. Here, we propose two new approaches to incorporate the preference for unimodal distributions into the predictive model. We analyse the set of unimodal distributions in the probability simplex and establish fundamental properties. We then propose a new architecture that imposes unimodal distributions and a new loss term that relies on the notion of projection in a set to promote unimodality. Experiments show that the new architecture achieves top-2 performance, while the proposed new loss term is very competitive and maintains high unimodality.

What problem does this paper attempt to address?

The problems that this paper attempts to solve are as follows: In the ordinal regression task, the existing methods fail to fully utilize the order information between class labels, especially failing to effectively generate unimodal distributions. Specifically:

1. **Limitations of Existing Methods**:
   - Commonly used loss functions such as the multicategory cross-entropy loss function are unable to capture the relative order information between class labels.
   - Most of the current methods relying on unimodal distributions are heuristic-based and lack a theoretical foundation.
2. **Goals of the Paper**:
   - Propose two new methods to incorporate unimodal distribution preferences into the prediction model in order to better handle the order information between class labels.
   - Force or promote the unimodality of the output probability distribution through a new neural network architecture and a new loss term.
3. **Specific Problem Description**:
   - The data features in the ordinal regression task correspond to a set of ordered class labels \( C=\{c_1, c_2,\ldots, c_K\} \), where \( c_1 < c_2 <\cdots < c_K \).
   - The goal is to find a reliable regression function \( h:X\rightarrow C \), mapping from the feature domain \( X \) to the ordered label domain \( C \).
   - Existing methods such as the multicategory cross-entropy loss function have limitations when dealing with ordinal data because they only focus on maximizing the probability of the true category, ignoring other probabilities, and do not constrain the model to generate a unimodal probability distribution.
4. **Solutions**:
   - A new neural network architecture is proposed, which forces the output to be a unimodal distribution.
   - A new loss term is proposed, which promotes unimodal distribution through the concept of projection, thereby better utilizing the order information between class labels.

Through these improvements, the paper aims to improve the performance of the ordinal regression task, especially in terms of accuracy and unimodality.
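To make the limitation of cross-entropy described above concrete, the following is a minimal sketch, not the paper's method. It assumes the usual definition of a unimodal distribution over ordered classes (probabilities do not decrease up to the mode and do not increase after it). The helper names `is_unimodal` and `cross_entropy` and the two example vectors are hypothetical, chosen only for illustration.

```python
import math


def is_unimodal(p, tol=1e-12):
    """Check that p rises (weakly) to its maximum and then falls (weakly).

    Assumed definition: there is an index k such that
    p[0] <= ... <= p[k] >= ... >= p[K-1].
    """
    k = max(range(len(p)), key=lambda i: p[i])  # index of the first maximum
    rising = all(p[i] <= p[i + 1] + tol for i in range(k))
    falling = all(p[i] + tol >= p[i + 1] for i in range(k, len(p) - 1))
    return rising and falling


def cross_entropy(p, true_index):
    """Multicategory cross-entropy for a single example with a one-hot target."""
    return -math.log(p[true_index])


# Two hypothetical predicted distributions over K = 5 ordered classes.
# Both assign probability 0.4 to the true class (index 2).
unimodal = [0.1, 0.2, 0.4, 0.2, 0.1]
bimodal = [0.3, 0.05, 0.4, 0.05, 0.2]

loss_unimodal = cross_entropy(unimodal, true_index=2)
loss_bimodal = cross_entropy(bimodal, true_index=2)

print(is_unimodal(unimodal), is_unimodal(bimodal))
print(loss_unimodal, loss_bimodal)
```

Both vectors receive the same cross-entropy loss because the loss depends only on the probability of the true class, even though only the first vector respects the label order. This is the gap that an architecture enforcing unimodality, or a loss term promoting it, is meant to close.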

Unimodal Distributions for Ordinal Regression
