Abstract: In this paper, we propose a new variant of Linear Discriminant Analysis (LDA) to solve multi-label classification tasks. The proposed method is based on a probabilistic model for defining the weights of individual samples in a weighted multi-label LDA approach. Linear Discriminant Analysis is a classical statistical machine learning method, which aims to find a linear data transformation increasing class discrimination in an optimal discriminant subspace. Traditional LDA sets assumptions related to Gaussian class distributions and single-label data annotations. To employ the LDA technique in multi-label classification problems, we exploit intuitions coming from a probabilistic interpretation of class saliency to redefine the between-class and within-class scatter matrices. The saliency-based weights obtained based on various kinds of affinity encoding prior information are used to reveal the probability of each instance to be salient for each of its classes in the multi-label problem at hand. The proposed Saliency-based weighted Multi-label LDA approach is shown to lead to performance improvements in various multi-label classification problems.
What problem does this paper attempt to address?
This paper addresses three key problems in multi-label classification:
1. **Limitations of Traditional LDA in Multi-label Classification**:
- Traditional Linear Discriminant Analysis (LDA) assumes single-label data and Gaussian class distributions. In multi-label classification, however, each sample can belong to several classes, and class sizes may be heavily imbalanced.
- Applying traditional LDA directly to multi-label data counts a sample once for every class it belongs to, so samples with many labels are over-counted, which degrades classification performance.
2. **Class Imbalance and Sample Importance Differences**:
- Class sizes can be extremely imbalanced in multi-label data, so the samples of some classes have too much or too little influence on the model.
- A method is needed to estimate how much each sample contributes to each of its classes, in order to balance the classes and improve classification.
3. **Class Label Correlation and Utilization of Feature Information**:
- Class labels in multi-label problems are usually correlated, and exploiting these correlations effectively to improve classification is an important problem.
- The feature information of the samples should be used alongside the label information to improve the classifier.
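The over-counting issue from item 1 can be seen on a toy label matrix. This is a minimal sketch in plain Python; the label matrix `Y` is hypothetical data made up for illustration, not taken from the paper.

```python
# Toy binary label matrix (hypothetical): rows are samples, columns are classes.
Y = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]

num_samples = len(Y)

# Number of labels attached to each sample.
labels_per_sample = [sum(row) for row in Y]

# Number of samples in each class (column sums).
class_counts = [sum(col) for col in zip(*Y)]

# If every class uses all of its samples with unit weight, the total number
# of sample contributions exceeds the number of samples.
total_contributions = sum(class_counts)

print(labels_per_sample)                 # [2, 1, 3, 1]
print(class_counts)                      # [2, 2, 3]
print(total_contributions, num_samples)  # 7 4
```

Here the third sample carries three labels and therefore enters the class statistics three times, while the second sample enters only once.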
To address these problems, the paper proposes Saliency-based weighted Multi-label Linear Discriminant Analysis (SwMLDA). Its main contributions are:
- **Introducing Saliency Estimation**: A probabilistic saliency estimation method computes a saliency weight for each sample with respect to each of its classes, so that the weight reflects how important the sample is for that class.
- **Handling Class Imbalance**: The saliency weights reduce the impact of class-imbalanced data sets on the classification results.
- **Combining Label and Feature Information**: Several prior weight factors (binary, misclassification, entropy, fuzziness, dependence, and correlation) integrate label and feature information into the computation of the scatter matrices.
With these changes, SwMLDA is reported to improve performance on multiple public multi-label data sets.
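To make the idea of a prior weight factor concrete, the sketch below contrasts two simple schemes on a binary label matrix. Both functions are illustrative assumptions: the summary names the prior types but does not give their formulas, so `binary_weights` and `label_count_normalized_weights` are hypothetical helpers, not the paper's definitions.

```python
def binary_weights(Y):
    """Weight 1.0 for every (sample, class) pair where the label is present."""
    return [[float(y) for y in row] for row in Y]


def label_count_normalized_weights(Y):
    """Split each sample's unit weight equally over its labels.

    Assumed form: w[i][k] = Y[i][k] / (number of labels of sample i).
    Samples without any label get all-zero weights.
    """
    weights = []
    for row in Y:
        k = sum(row)
        weights.append([y / k if k else 0.0 for y in row])
    return weights


# Hypothetical labels: sample 0 has two labels, sample 1 has one.
Y = [[1, 0, 1],
     [0, 1, 0]]

print(binary_weights(Y))                  # [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(label_count_normalized_weights(Y))  # [[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
```

With binary weights a two-label sample contributes a total weight of 2, whereas the normalized variant keeps every sample's total contribution at 1.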
### Formula Summary
1. **Saliency Weight Matrix \( P \)**:
\[
P = [p_1,\ldots,p_i,\ldots,p_N]=[p^{(1)},\ldots,p^{(j)},\ldots,p^{(C)}]^T
\]
where \( p_i\in\mathbb{R}^C \) is the optimal weight vector of the \( i \)-th sample, and \( p^{(j)}\in\mathbb{R}^N \) is the weight vector of the \( j \)-th class.
2. **Redefinition of the Scatter Matrices**:
\[
S_b = X\left(P^T P-\frac{1}{n}\hat{p}\hat{p}^T\right)X^T
\]
\[
S_t = X\left(\text{diag}(\hat{p})-\frac{1}{n}\hat{p}\hat{p}^T\right)X^T
\]
where \( S_b \) and \( S_t \) are the between-class and total scatter matrices, \( \hat{p}=\sum_{c = 1}^C p^{(c)}\in\mathbb{R}^N \), and the weight vectors are normalized so that the weights of each class sum to 1.
3. **Saliency Score Vector \( p^* \)**:
\[
p^* = H^{-1}\mathbf{1}
\]
where \( H = D - W + V \), \( D \) is a diagonal matrix, \( W \) is a similarity matrix, \( V \) is a prior information matrix, and \( \mathbf{1} \) is the all-ones vector.
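The saliency score computation in item 3 amounts to solving one linear system. The NumPy sketch below makes two assumptions that the summary does not state: \( D \) is taken to be the degree matrix of \( W \) (row sums on the diagonal), and \( V \) is taken to be diagonal with strictly positive entries. The function name `saliency_scores` and the toy inputs are hypothetical.

```python
import numpy as np


def saliency_scores(W, v):
    """Solve (D - W + V) p = 1 for the score vector p.

    W: (N, N) symmetric, non-negative similarity matrix.
    v: (N,) strictly positive prior values, used as the diagonal of V (assumption).
    """
    W = np.asarray(W, dtype=float)
    v = np.asarray(v, dtype=float)
    D = np.diag(W.sum(axis=1))        # assumption: D is the degree matrix of W
    H = D - W + np.diag(v)            # graph Laplacian plus a positive diagonal
    ones = np.ones(W.shape[0])
    return np.linalg.solve(H, ones)   # solves H p = 1 without forming H^{-1}


# Toy similarity matrix (hypothetical): samples 0 and 1 are strongly similar.
W = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
v = np.array([1.0, 2.0, 0.5])         # hypothetical prior values

p_star = saliency_scores(W, v)
print(p_star)
```

Under these assumptions \( D - W \) is a graph Laplacian, so adding a positive diagonal makes \( H \) positive definite and the system always has a unique solution. Using `np.linalg.solve` rather than an explicit inverse is a numerical design choice, not something the source prescribes.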
Through these formulas and methods, SwMLDA can more effectively handle class imbalance and sample importance differences in multi-label classification problems, thereby improving classification performance.
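The redefined scatter matrices from item 2 of the formula summary can be sketched in NumPy as follows. Several choices here are assumptions rather than statements from the source: \( X \) is stored with one column per sample, \( n \) is taken to be the total weight mass \( \sum_i \hat{p}_i \) (the summary does not define it), the mean-correction term is the outer product \( \hat{p}\hat{p}^T \) so that it is \( N \times N \), and a binary label matrix stands in for the learned saliency weights. The name `weighted_scatter` is hypothetical.

```python
import numpy as np


def weighted_scatter(X, P):
    """Return (S_b, S_t) for data X of shape (D, N) and weights P of shape (C, N)."""
    X = np.asarray(X, dtype=float)
    P = np.asarray(P, dtype=float)
    P = P / P.sum(axis=1, keepdims=True)     # weights of each class sum to 1
    p_hat = P.sum(axis=0)                    # (N,) sum of the class weight vectors
    n = p_hat.sum()                          # assumption: n is the total weight mass
    mean_term = np.outer(p_hat, p_hat) / n   # (N, N) mean-correction term
    S_b = X @ (P.T @ P - mean_term) @ X.T
    S_t = X @ (np.diag(p_hat) - mean_term) @ X.T
    return S_b, S_t


rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))                  # 2 features, 5 samples (toy data)

# Binary labels used as placeholder weights: 3 classes, 5 samples.
Y = np.array([[1, 1, 0, 0, 1],
              [0, 1, 1, 1, 0],
              [1, 0, 0, 1, 1]], dtype=float)

S_b, S_t = weighted_scatter(X, Y)
print(S_b.shape, S_t.shape)  # (2, 2) (2, 2)
```

With these assumptions both matrices are symmetric and positive semi-definite, as is their difference \( S_t - S_b \), which plays the role of a within-class scatter matrix.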