iSegFormer: Interactive Segmentation via Transformers with Application to 3D Knee MR Images

Qin Liu, Zhenlin Xu, Yining Jiao, Marc Niethammer
DOI: https://doi.org/10.48550/arXiv.2112.11325
2021-12-21
Computer Vision and Pattern Recognition
Abstract: We propose iSegFormer, a memory-efficient transformer that combines a Swin transformer with a lightweight multilayer perceptron (MLP) decoder. With the efficient Swin transformer blocks for hierarchical self-attention and the simple MLP decoder for aggregating both local and global attention, iSegFormer learns powerful representations while achieving high computational efficiency. Specifically, we apply iSegFormer to interactive 3D medical image segmentation.
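To make the encoder/decoder pairing in the abstract more concrete, the following is a minimal shape-tracing sketch, not the authors' implementation. It assumes a Swin-style hierarchy (4x4 patches, then 2x2 patch merging that halves resolution and doubles channels at each stage) and a SegFormer-style all-MLP decoder that projects every stage to a common width, upsamples to the finest stage, concatenates, and fuses. The function names and the defaults `embed_dim=96`, `decoder_dim=256`, and `num_classes=1` are illustrative assumptions and are not taken from the abstract.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def encoder_stage_shapes(height: int, width: int, embed_dim: int = 96,
                         patch_size: int = 4, num_stages: int = 4) -> List[Shape]:
    """Feature-map shapes produced by a hierarchical (Swin-style) encoder.

    Stage 1 embeds patch_size x patch_size patches; every later stage merges
    2x2 neighbouring patches, halving the resolution and doubling the channels.
    """
    coarsest_stride = patch_size * 2 ** (num_stages - 1)
    if height % coarsest_stride or width % coarsest_stride:
        raise ValueError(f"height and width must be divisible by {coarsest_stride}")
    shapes = []
    for stage in range(num_stages):
        stride = patch_size * 2 ** stage
        shapes.append((embed_dim * 2 ** stage, height // stride, width // stride))
    return shapes


def mlp_decoder_shapes(stage_shapes: List[Shape], decoder_dim: int = 256,
                       num_classes: int = 1) -> Dict[str, Shape]:
    """Trace tensor shapes through an assumed SegFormer-style MLP decoder."""
    _, out_h, out_w = stage_shapes[0]  # finest-resolution stage sets the output grid
    # Steps 1-2: project each stage to decoder_dim, then upsample to the finest grid.
    projected = [(decoder_dim, out_h, out_w) for _ in stage_shapes]
    # Step 3: concatenate all projected maps along the channel axis.
    concatenated = (decoder_dim * len(projected), out_h, out_w)
    # Step 4: a linear layer fuses the concatenation back to decoder_dim channels.
    fused = (decoder_dim, out_h, out_w)
    # Step 5: a final linear layer predicts per-pixel class logits.
    logits = (num_classes, out_h, out_w)
    return {"concatenated": concatenated, "fused": fused, "logits": logits}


if __name__ == "__main__":
    stages = encoder_stage_shapes(224, 224)
    for index, shape in enumerate(stages, start=1):
        print(f"stage {index}: {shape}")
    print(mlp_decoder_shapes(stages))
```

Under these assumptions the decoder contains only per-pixel linear layers and upsampling, which is why it adds little compute or memory on top of the encoder.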