Splitting expands the application range of Vision Transformer -- variable Vision Transformer (vViT)

Takuma Usuzaki
DOI: https://doi.org/10.48550/arXiv.2211.03992
2022-11-08
Quantitative Methods
Abstract: Vision Transformer (ViT) has achieved outstanding results in computer vision. Although many Transformer-based architectures have been derived from the original ViT, their patches usually all share the same dimension. This constraint limits the range of applications in the medical field, where the available data differ in dimension from one another, e.g. medical images, patients' personal information, laboratory tests, and so on. To overcome this limitation, we develop a new derivative of ViT termed variable Vision Transformer (vViT). The aim of this study is to introduce vViT and to apply it to radiomics using T1-weighted magnetic resonance imaging (MRI) of glioma. In the prediction of 365-day survival among glioma patients using radiomics, vViT achieved 0.83, 0.82, 0.81, and 0.76 in sensitivity, specificity, accuracy, and AUC-ROC, respectively. vViT has the potential to handle different types of medical information at once.
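The abstract does not say how vViT brings patches of different dimensions into one model. The sketch below illustrates one possible approach, which is an assumption and not the author's confirmed method: each patch gets its own linear map into a shared embedding size, so a standard Transformer encoder could then consume the resulting tokens. The function name `embed_variable_patches`, the random weights (a stand-in for learned parameters), and the example values are all hypothetical.

```python
import random


def embed_variable_patches(patches, embed_dim, seed=0):
    """Project patches of differing lengths to tokens of one shared size.

    Assumption: one linear map per patch, randomly initialised here in
    place of learned weights. The abstract does not specify this design.
    """
    rng = random.Random(seed)
    tokens = []
    for patch in patches:
        # One weight matrix per patch, shaped (embed_dim, len(patch)).
        weights = [[rng.gauss(0.0, 1.0) for _ in patch] for _ in range(embed_dim)]
        # Matrix-vector product: every token ends up with embed_dim entries.
        tokens.append([sum(w * x for w, x in zip(row, patch)) for row in weights])
    return tokens


# Hypothetical inputs whose dimensions differ from one another.
patches = [
    [0.2, 0.5, 0.1, 0.9],  # e.g. four image-derived features
    [63.0, 1.0],           # e.g. age and one coded attribute
    [4.1, 140.0, 0.8],     # e.g. three laboratory values
]
tokens = embed_variable_patches(patches, embed_dim=8)
print([len(t) for t in tokens])  # every token now has length 8
```

Using a separate projection per patch keeps each data source's original dimension intact until the embedding step. Zero-padding every patch to the longest length would be an alternative design.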