Ordinal Multiple-instance Learning for Ulcerative Colitis Severity Estimation with Selective Aggregated Transformer

Kaito Shiku, Kazuya Nishimura, Daiki Suehiro, Kiyohito Tanaka, Ryoma Bise
2024-11-22
Abstract: Patient-level diagnosis of severity in ulcerative colitis (UC) is common in real clinical settings, where the most severe score in a patient is recorded. However, previous UC classification methods (i.e., image-level estimation) mainly assumed the input was a single image. Thus, these methods cannot utilize severity labels recorded in real clinical settings. In this paper, we propose a patient-level severity estimation method by a transformer with selective aggregator tokens, where a severity label is estimated from multiple images taken from a patient, similar to a clinical setting. Our method can effectively aggregate features of severe parts from a set of images captured from each patient, and it improves the discriminative ability between adjacent severity classes. Experiments demonstrate the effectiveness of the proposed method on two datasets compared with the state-of-the-art MIL methods. Moreover, we evaluated our method in real clinical settings and confirmed that our method outperformed the previous image-level methods. The code is publicly available at https://github.com/Shiku-Kaito/Ordinal-Multiple-instance-Learning-for-Ulcerative-Colitis-Severity-Estimation.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to solve is how to estimate the severity of ulcerative colitis (UC) from patient-level diagnosis records, without image-level labels. Specifically, the authors propose a method that estimates a patient-level severity label from multiple endoscopic images of each patient within the framework of multiple-instance learning (MIL). The method addresses a limitation of existing UC classification methods: they estimate severity from a single image and therefore cannot make full use of the patient-level severity data actually recorded in clinical settings.

### Main problems and challenges

1. **Ordinal relationship**: UC severity labels are ordered, and existing MIL methods usually do not handle this ordering well.
2. **Maximum severity estimation**: In clinical practice, the patient-level severity is determined by the most severe part across all images. The model therefore has to extract the most severe features from multiple images accurately, rather than simply aggregating the features of all images.
3. **Distinguishing adjacent categories**: MIL methods usually aggregate the features of all instances, including those with lower severity, which makes adjacent severity categories hard to distinguish.

### Solutions

To solve the above problems, the authors propose a Selective Aggregated Transformer for Ordinal Multiple-Instance Learning (SATOMIL). The model introduces multiple selective aggregator tokens, each responsible for aggregating instance features above a specific severity level. Specifically:

- The model introduces \(K - 1\) selective aggregator tokens, where \(K\) is the number of severity classes, and each token \(t_k\) aggregates the features of instances that satisfy \(Y_i > k\), where \(Y_i\) is the severity of instance \(i\).
- This design enables each token to aggregate severe instance features at a different severity threshold, thereby generating discriminative bag-level features.
- In this way, the model can better distinguish adjacent severity categories and improve classification accuracy.

### Experimental results

The experimental results show that SATOMIL outperforms existing MIL methods on both datasets. Ablation experiments also support the method: in particular, accounting for the ordinal relationship of the categories during aggregation performs significantly better than simply introducing an ordinal classification algorithm.

### Summary

The main contributions of this paper are:

1. Proposing a method for estimating UC severity using only the patient-level diagnosis records that already exist in clinical settings, without additional image-level annotations.
2. Designing a selective aggregation Transformer specifically for the maximum severity estimation problem in ordinal MIL.
3. Verifying the effectiveness of the method experimentally, including in a real clinical setting, where it performs better than existing methods.

Through these improvements, the method not only improves the accuracy of UC severity estimation but also offers ideas for other similar medical image analysis tasks.
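
The aggregation rule described under Solutions (token \(t_k\) gathers instances with \(Y_i > k\), and the patient-level label is the most severe instance) can be illustrated with a small sketch. This is not the authors' implementation: it assumes severity labels are integers `0..K-1` and tokens are indexed `k = 0..K-2`, and all function names are hypothetical. In the actual setting instance labels are unavailable during training and the selection has to be learned by the transformer; the sketch only shows which instances each token is meant to cover and how the `K - 1` threshold decisions map back to a class.

```python
from typing import List


def bag_label(instance_severities: List[int]) -> int:
    """Patient-level label: the most severe score among a patient's images."""
    if not instance_severities:
        raise ValueError("a bag needs at least one instance")
    return max(instance_severities)


def selected_instances(instance_severities: List[int], k: int) -> List[int]:
    """Indices of instances that token t_k is meant to aggregate (Y_i > k)."""
    return [i for i, y in enumerate(instance_severities) if y > k]


def ordinal_targets(label: int, num_classes: int) -> List[int]:
    """K-1 binary targets, one per token: target k is 1 when label > k.

    Assumption: labels are integers in 0..num_classes-1.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if label > k else 0 for k in range(num_classes - 1)]


def decode(binary_predictions: List[int]) -> int:
    """Recover the class as the number of thresholds that were exceeded."""
    return sum(binary_predictions)
```

For a hypothetical bag with severities `[0, 2, 1, 3, 0]` and `K = 4`, token `t_1` covers only the instances at indices 1 and 3, so mild images do not dilute the features used to decide whether the patient exceeds severity 1.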