Review of Research on Application of Vision Transformer in Medical Image Analysis

SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing
DOI: https://doi.org/10.3778/j.issn.1002-8331.2206-0022
2023-01-01
Abstract: The deep self-attention network (Transformer) has a natural ability to model global features and long-range correlations in its input, which strongly complements the inductive bias of convolutional neural networks (CNNs). Inspired by its great success in natural language processing, the Transformer has been widely introduced into computer vision tasks, especially medical image analysis, and has achieved remarkable performance. This paper first introduces typical work on vision Transformers for natural images, and then organizes and summarizes related work by lesion or organ in the subfields of medical image segmentation, medical image classification, and medical image registration, focusing on the implementation ideas of some representative studies. Finally, current research is discussed and future directions are pointed out. The purpose of this paper is to provide a reference for further in-depth research in this field.