Deep Bayesian Active Learning-to-Rank with Relative Annotation for Estimation of Ulcerative Colitis Severity

Takeaki Kadota, Hideaki Hayashi, Ryoma Bise, Kiyohito Tanaka, Seiichi Uchida
2024-09-10
Abstract: Automatic image-based severity estimation is an important task in computer-aided diagnosis. Severity estimation by deep learning requires a large amount of training data to achieve a high performance. In general, severity estimation uses training data annotated with discrete (i.e., quantized) severity labels. Annotating discrete labels is often difficult in images with ambiguous severity, and the annotation cost is high. In contrast, relative annotation, in which the severity between a pair of images is compared, can avoid quantizing severity and thus makes it easier. We can estimate relative disease severity using a learning-to-rank framework with relative annotations, but relative annotation has the problem of the enormous number of pairs that can be annotated. Therefore, the selection of appropriate pairs is essential for relative annotation. In this paper, we propose a deep Bayesian active learning-to-rank that automatically selects appropriate pairs for relative annotation. Our method preferentially annotates unlabeled pairs with high learning efficiency from the model uncertainty of the samples. We prove the theoretical basis for adapting Bayesian neural networks to pairwise learning-to-rank and demonstrate the efficiency of our method through experiments on endoscopic images of ulcerative colitis on both private and public datasets. We also show that our method achieves a high performance under conditions of significant class imbalance because it automatically selects samples from the minority classes.
Computer Vision and Pattern Recognition
### What problems does this paper attempt to solve?

This paper addresses several key challenges in automatically estimating the severity of ulcerative colitis (UC) from endoscopic images:

1. **Difficulty and high cost of absolute annotation**:
   - Absolute annotation assigns a discrete severity label directly to each image. This is very difficult for images with ambiguous severity and requires a great deal of expert time and effort.
   - Research shows that absolute annotations vary considerably between experts and can be inconsistent even when the same expert annotates at different times.
2. **Pair selection in relative annotation**:
   - Relative annotation simplifies the annotation process and reduces subjective bias by comparing the severity of a pair of images instead of quantizing it.
   - However, the number of possible image pairs grows quadratically with the number of images, reaching \(N(N - 1)/2\) for \(N\) images, so annotating every pair is impractical. Selecting appropriate image pairs for annotation is therefore crucial.
3. **Class imbalance**:
   - Medical image datasets are usually class-imbalanced: normal or mildly diseased images far outnumber severely diseased ones. This can bias the model toward the majority class during training and impair learning on minority-class samples.

### Proposed method

To address these challenges, the authors propose a deep Bayesian active learning-to-rank method based on a Bayesian convolutional neural network (Bayesian CNN).
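
As background for the learning-to-rank component, the sketch below shows one common way to train on relative annotations: a shared scorer assigns each image a scalar severity score, and the probability that one image is more severe than the other is a sigmoid of the score difference. This RankNet-style formulation is an assumption for illustration; the summary does not state the exact loss the authors use, and the function names and example scores are hypothetical.

```python
import numpy as np

def pairwise_preference_prob(score_i, score_j):
    """P(image i is more severe than image j) under a RankNet-style model."""
    return 1.0 / (1.0 + np.exp(-(score_i - score_j)))

def pairwise_ranking_loss(score_i, score_j, label):
    """Binary cross-entropy on the preference probability.

    label is 1.0 if the annotator judged image i more severe, 0.0 if image j.
    """
    prob = pairwise_preference_prob(score_i, score_j)
    eps = 1e-12  # guards against log(0)
    return -(label * np.log(prob + eps) + (1.0 - label) * np.log(1.0 - prob + eps))

# Hypothetical severity scores for three annotated pairs (i, j).
scores_i = np.array([2.0, 0.5, 1.0])
scores_j = np.array([1.0, 1.5, 1.0])
labels = np.array([1.0, 0.0, 1.0])  # relative annotations from an expert

probs = pairwise_preference_prob(scores_i, scores_j)
losses = pairwise_ranking_loss(scores_i, scores_j, labels)
print(probs)
print(losses)
```

Because only the score difference enters the loss, the annotator never has to commit to a discrete severity level, which is the property that makes relative annotation attractive here.
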
The main features of this method include:

- **Uncertainty estimation**: A Bayesian CNN estimates the uncertainty of samples using MC dropout, so that the image pairs most useful for learning can be selected for annotation.
- **Active learning framework**: Unlabeled image pairs with high uncertainty are selected step by step and given relative annotations by medical experts.
- **Theoretical justification**: The authors provide a theoretical basis showing that MC dropout can be applied to pairwise learning with a Siamese network structure to estimate uncertainty.
- **Experimental verification**: The method is evaluated on private and public UC endoscopic image datasets. Under significant class imbalance, it preferentially selects important samples from the minority classes.

### Main contributions

1. An active learning method that introduces a Bayesian CNN into the learning-to-rank framework, addressing the image pair selection problem in relative annotation.
2. A theoretical proof that MC dropout is applicable to uncertainty estimation in pairwise learning with a Bayesian Siamese neural network.
3. Experimental evidence that the method improves severity estimation for UC endoscopic images with fewer annotated image pairs, and that it generalizes to a public dataset.
4. Evidence that the method is also effective for multi-class classification, in particular for dividing images into discrete severity levels, a common task in medical image diagnosis.

Through these contributions, the paper offers an annotation-efficient approach to estimating UC severity from endoscopic images, reducing annotation cost while improving model performance.
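
The uncertainty-driven selection summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-layer scorer with random weights stands in for a trained Bayesian CNN, random vectors stand in for image features, the variance of the score across stochastic passes is used as the uncertainty measure, and pairing the most uncertain images is a hypothetical selection rule.

```python
import itertools
import numpy as np

def mc_dropout_scores(x, w1, w2, rng, n_samples=50, p_drop=0.5):
    """Return an (n_samples, n_images) array of stochastic severity scores.

    Dropout stays active at inference, so each pass samples a different
    sub-network; this is the MC dropout approximation to a Bayesian network.
    """
    scores = []
    for _ in range(n_samples):
        hidden = np.maximum(x @ w1, 0.0)            # ReLU features
        mask = rng.random(hidden.shape) >= p_drop   # keep units with prob 1 - p_drop
        hidden = hidden * mask / (1.0 - p_drop)     # inverted dropout scaling
        scores.append((hidden @ w2).ravel())
    return np.stack(scores)

def select_uncertain_pairs(scores, n_images_to_pick=4):
    """Pick the most uncertain images and form all pairs among them.

    Assumption: uncertainty is the per-image variance of the score over
    the MC dropout passes.
    """
    uncertainty = scores.var(axis=0)
    picked = np.argsort(uncertainty)[::-1][:n_images_to_pick]
    pairs = list(itertools.combinations(sorted(picked.tolist()), 2))
    return uncertainty, pairs

rng = np.random.default_rng(0)
features = rng.normal(size=(20, 8))  # 20 hypothetical unlabeled images
w1 = rng.normal(size=(8, 16))        # placeholder weights, not trained
w2 = rng.normal(size=(16, 1))

scores = mc_dropout_scores(features, w1, w2, rng)
uncertainty, pairs = select_uncertain_pairs(scores)
print(pairs)  # pairs of image indices to send to the annotator
```

With 20 images there are 20 × 19 / 2 = 190 candidate pairs; restricting annotation to the four most uncertain images leaves only 6 pairs per round. In an active learning loop, these pairs would be annotated, added to the training set, and the model retrained before the next selection round.
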