Uncertainty Voting Ensemble for Imbalanced Deep Regression

Yuchang Jiang,Vivien Sainte Fare Garnot,Konrad Schindler,Jan Dirk Wegner
2024-10-26
Abstract:Data imbalance is ubiquitous when applying machine learning to real-world problems, particularly regression problems. If training data are imbalanced, the learning is dominated by the densely covered regions of the target distribution and the learned regressor tends to exhibit poor performance in sparsely covered regions. Beyond standard measures like oversampling or reweighting, there are two main approaches to handling learning from imbalanced data. For regression, recent work leverages the continuity of the distribution, while for classification, the trend has been to use ensemble methods, allowing some members to specialize in predictions for sparser regions. In our method, named UVOTE, we integrate recent advances in probabilistic deep learning with an ensemble approach for imbalanced regression. We replace traditional regression losses with negative log-likelihood, which also predicts sample-wise aleatoric uncertainty. Our experiments show that this loss function handles imbalance better. Additionally, we use the predicted aleatoric uncertainty values to fuse the predictions of different expert models in the ensemble, eliminating the need for a separate aggregation module. We compare our method with existing alternatives on multiple public benchmarks and show that UVOTE consistently outperforms the prior art, while at the same time producing better-calibrated uncertainty estimates. Our code is available at <a class="link-external link-https" href="https://github.com/SherryJYC/UVOTE" rel="external noopener nofollow">this https URL</a>.
Machine Learning
What problem does this paper attempt to address?
This paper attempts to solve the problem of data imbalance in deep regression tasks. Specifically, when the training data is imbalanced, the learning process will be dominated by the densely covered regions in the target distribution, resulting in poor performance of the learned regressor in the sparsely covered regions. This paper proposes a method named UVOTE (Uncertainty Voting Ensemble). By integrating multiple expert models and using the predicted sample uncertainty to dynamically aggregate the prediction results of these expert models, it can effectively handle the data imbalance problem and improve the prediction performance of the model in sparse regions. ### Main contributions of the paper: 1. **Introduction of UVOTE**: This is a novel and efficient end - to - end method for handling unbalanced deep regression tasks. 2. **Performance improvement**: UVOTE outperforms all existing methods on four challenging datasets, not only reducing the regression error but also providing better calibrated uncertainty estimates. 3. **Innovative integration method**: UVOTE is the first unbalanced deep regression method that uses probabilistic deep learning for ensemble aggregation. ### Method overview: - **Multi - head architecture**: UVOTE adopts a design of a shared backbone encoder and multiple regression heads, with each regression head acting as a different expert model. - **Uncertainty prediction**: Each expert model predicts not only the target value but also the uncertainty associated with that prediction. - **Dynamic training**: Different expert models are trained with different sample weighting strategies to achieve diversity. - **Uncertainty - based expert aggregation**: In the inference stage, the output of the most reliable expert model is selected according to the predicted uncertainty, rather than simply averaging the predictions of all experts. ### Experimental results: - **Overall performance**: The overall performance of UVOTE on all four datasets is better than that of existing methods. - **Sparse region performance**: Especially in sparse regions (few - shot), the performance improvement of UVOTE is the most significant, with the improvement range from 43% to 20% compared to the baseline model (Vanilla). - **Uncertainty estimate**: The uncertainty estimate provided by UVOTE is more calibrated than that of the baseline model, especially in sparse regions. ### Conclusion: By combining probabilistic deep learning and integration methods, UVOTE effectively solves the problem of unbalanced data in deep regression tasks, not only improving the overall performance but also achieving a significant improvement in the performance in sparse regions. In addition, the uncertainty estimate of UVOTE is more reliable and is suitable for downstream tasks that rely on regression outputs.