Abstract: This work addresses the problem of high-dimensional classification by exploring the generalized Bayesian logistic regression method under a sparsity-inducing prior distribution. The method uses a fractional power of the likelihood, resulting in the fractional posterior. Our study yields concentration results for the fractional posterior, not only for the joint distribution of the predictor and response variable but also for the regression coefficients. Significantly, we derive novel findings concerning misclassification excess risk bounds using sparse generalized Bayesian logistic regression. These results parallel recent findings for penalized methods in the frequentist literature. Furthermore, we extend our results to the scenario of model misspecification, which is of critical importance.
What problem does this paper attempt to address?
The paper asks how sparse generalized Bayesian logistic regression can be used for high-dimensional classification. Specifically, it considers the setting where the data dimension is much larger than the sample size, places a sparsity-inducing prior on the regression coefficients, and classifies using the fractional posterior. It then derives guarantees, in particular misclassification excess risk bounds, that parallel those available for penalized methods in the frequentist literature.
### Main Contributions
1. **Concentration Properties of the Fractional Posterior**: The paper provides concentration results for the fractional posterior under different metrics, including the α-Rényi divergence, Hellinger distance, and total variation distance. These results apply not only to the joint distribution but also to the distribution of the regression coefficients.
2. **Misclassification Excess Risk Bounds**: The paper derives misclassification excess risk bounds for sparse generalized Bayesian logistic regression in high-dimensional classification, and these results are comparable to those in the frequentist literature.
3. **Extension in the Case of Model Misspecification**: The paper further explores the concentration properties of the fractional posterior and misclassification excess risk bounds in the case of model misspecification.
### Methods
- **Fractional Posterior**: Construct the fractional posterior by using the fractional power of the likelihood function, which helps to deal with the problem of model misspecification.
- **Sparse Prior**: Use a heavy-tailed distribution (such as the scaled Student's t-distribution) as a prior to induce sparsity.
- **Technical Tools**: Use technical tools such as PAC-Bayesian inequalities to derive concentration rates.
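
For concreteness, the first two ingredients above can be written out as follows. This is a sketch in notation assumed here rather than quoted from the paper: labels $y_i \in \{-1, +1\}$, covariates $x_i \in \mathbb{R}^p$, coefficient vector $\theta$, fractional power $\alpha \in (0,1)$, and prior scale $\tau > 0$. The product form of the prior shown (a Student's t-type density with three degrees of freedom, rescaled by $\tau$) is one common choice and is an assumption, not necessarily the paper's exact specification.

```latex
\begin{align*}
% Logistic likelihood for labels in {-1, +1}
L_n(\theta) &= \prod_{i=1}^{n} \frac{1}{1 + \exp\!\left(-y_i\, x_i^{\top}\theta\right)}, \\
% Fractional posterior: the likelihood is tempered by alpha before combining with the prior
\pi_{n,\alpha}(\theta) &\propto L_n(\theta)^{\alpha}\, \pi(\theta), \qquad \alpha \in (0,1), \\
% Assumed heavy-tailed prior (scaled Student's t type)
\pi(\theta) &\propto \prod_{j=1}^{p} \left(\tau^{2} + \theta_j^{2}\right)^{-2}.
\end{align*}
```

Setting $\alpha = 1$ would recover the usual posterior; taking $\alpha < 1$ downweights the likelihood relative to the prior.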
### Results
- **Concentration Results**: The paper proves concentration results for the fractional posterior under the α-Rényi divergence; under certain conditions, these can be converted into concentration results under the Hellinger distance and total variation distance.
- **Misclassification Excess Risk**: The paper derives misclassification excess risk bounds for sparse generalized Bayesian logistic regression in high-dimensional classification, and these results are comparable to those in the frequentist literature.
- **Model Misspecification**: The paper also explores the concentration results and misclassification excess risk bounds in the case of model misspecification.
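
To illustrate why a bound in Rényi divergence can carry over to the other two distances, the standard relations for the special case $\alpha = 1/2$ are sketched below. The notation ($p$, $q$ for densities of $P$, $Q$ with respect to a dominating measure $\mu$) and the normalizations of $H$ and $d_{TV}$ are assumptions made here; for general $\alpha$ the conversion involves constants depending on $\alpha$, and the paper's exact constants are not reproduced.

```latex
\begin{align*}
D_{\alpha}(P, Q) &= \frac{1}{\alpha - 1} \log \int p^{\alpha} q^{1-\alpha}\, d\mu,
  \qquad \alpha \in (0,1), \\
H^{2}(P, Q) &= 1 - \int \sqrt{p\,q}\; d\mu, \\
% Uses -log(1 - x) >= x for x in [0, 1)
D_{1/2}(P, Q) &= -2 \log\!\left(1 - H^{2}(P, Q)\right) \;\ge\; 2\, H^{2}(P, Q), \\
d_{TV}(P, Q) &= \frac{1}{2} \int |p - q|\; d\mu \;\le\; \sqrt{2}\, H(P, Q).
\end{align*}
```

Under these conventions, $D_{1/2}(P, Q) \le \varepsilon$ implies $H^{2}(P, Q) \le \varepsilon/2$ and hence $d_{TV}(P, Q) \le \sqrt{\varepsilon}$.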
### Discussion
- **Computational Aspects**: Although the paper mainly focuses on theoretical properties, it also briefly discusses the computational aspects of the fractional posterior, especially the advantages of using the Langevin Monte Carlo method for sampling.
- **Prior Selection**: The paper compares the effects of different priors (such as the scaled Student's t-distribution and the spike-and-slab prior), and points out that although the scaled Student's t-distribution is conducive to sparsity, it lacks variable selection ability.
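
As an illustration of the sampling approach mentioned under Computational Aspects, the following is a minimal sketch of an unadjusted Langevin algorithm targeting a fractional posterior. It is not the paper's implementation. The function names, the default values of `alpha`, `tau`, and `step`, the label coding in {-1, +1}, and the prior form $\pi(\theta) \propto \prod_j (\tau^2 + \theta_j^2)^{-2}$ are all assumptions made for this example.

```python
import numpy as np


def grad_log_fractional_posterior(theta, X, y, alpha, tau):
    """Gradient of alpha * log-likelihood + log-prior; labels y are in {-1, +1}."""
    margins = y * (X @ theta)
    # sigmoid(-m) written via tanh for numerical stability
    weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
    # d/dtheta sum_i log sigmoid(y_i x_i^T theta) = sum_i y_i x_i sigmoid(-margin_i)
    grad_loglik = X.T @ (y * weights)
    # Assumed prior: log pi(theta) = -2 * sum_j log(tau^2 + theta_j^2) + const
    grad_logprior = -4.0 * theta / (tau**2 + theta**2)
    return alpha * grad_loglik + grad_logprior


def langevin_sample(X, y, alpha=0.5, tau=0.1, step=1e-3,
                    n_iter=2000, burn_in=1000, seed=0):
    """Unadjusted Langevin iterations; returns the draws kept after burn-in."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    draws = []
    for k in range(n_iter):
        noise = rng.standard_normal(theta.shape)
        grad = grad_log_fractional_posterior(theta, X, y, alpha, tau)
        # theta <- theta + h * grad log pi(theta) + sqrt(2h) * N(0, I)
        theta = theta + step * grad + np.sqrt(2.0 * step) * noise
        if k >= burn_in:
            draws.append(theta.copy())
    return np.array(draws)
```

Given a design matrix `X` and labels `y`, one might average the returned draws to obtain a point estimate and classify a new point `x` by the sign of `x @ draws.mean(axis=0)`. Because the iteration is unadjusted (no Metropolis correction), the draws carry a discretization bias that depends on `step`, so the step size would need tuning in practice.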
### Conclusion
Using sparse generalized Bayesian logistic regression, the paper establishes theoretical guarantees for high-dimensional classification, most notably misclassification excess risk bounds comparable to those in the frequentist literature. These results are theoretically significant and also offer a new perspective for practical applications.