MedFCT: A Frequency Domain Joint CNN-Transformer Network for Semi-supervised Medical Image Segmentation

Shiao Xie, Huimin Huang, Ziwei Niu, Lanfen Lin, Yen-Wei Chen
DOI: https://doi.org/10.1109/ICME55011.2023.00328
2023-01-01
Abstract: Semi-supervised learning (SSL) is a data-efficient way to leverage large-scale unannotated data and reduce the dependence on labeled data. The Mean-Teacher (MT) scheme, with its teacher-student architecture, has proven effective in semi-supervised medical image segmentation: the student network learns from the teacher by minimizing a pixel-wise consistency loss. However, existing MT-based SSL methods still raise two main concerns: (1) the limited learning ability of the student network, which does not combine local feature extraction with global cue extraction and may therefore hamper representation learning for variable objects; (2) the limited knowledge-transfer ability of the teacher network, whose pixel-level-only consistency regularization may yield inadequate and unstable guidance. To address these limitations, we propose a novel semi-supervised learning scheme, MedFCT, with two key designs: (1) a dual-student architecture with parallel CNN and Transformer branches for local-global feature extraction, in which a frequency domain cross-fusion (FDCF) module explores the full-frequency interaction between the CNN and the Transformer to learn the complementarity of the two feature paradigms; (2) a comprehensive multi-level consistency regularization that considers pixel-wise, feature-wise and class-wise information to provide more effective guidance and knowledge transfer from the teacher network. Experiments show that MedFCT outperforms previous state-of-the-art methods on two public medical image segmentation benchmarks.
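For readers unfamiliar with the Mean-Teacher baseline that the abstract builds on, the following is a minimal sketch of its two generic ingredients: a pixel-wise consistency loss between student and teacher predictions, and an exponential-moving-average (EMA) update of the teacher weights. It is not MedFCT's implementation. The function names, the flat-list representation of pixels and weights, the use of mean squared error as the consistency measure, and the decay value `alpha` are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_consistency_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher class probabilities.

    Both arguments are lists of per-pixel logit lists (hypothetical layout);
    the loss is averaged over every pixel and every class.
    """
    total, count = 0.0, 0
    for s_px, t_px in zip(student_logits, teacher_logits):
        for p_s, p_t in zip(softmax(s_px), softmax(t_px)):
            total += (p_s - p_t) ** 2
            count += 1
    return total / count


def ema_update(teacher_params, student_params, alpha=0.99):
    """Move each teacher weight toward the matching student weight.

    alpha is an assumed decay rate; larger values make the teacher
    change more slowly.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]


# Two pixels, two classes each (made-up numbers for illustration only).
student = [[2.0, 0.5], [0.1, 1.2]]
teacher = [[1.5, 0.7], [0.3, 1.0]]
loss = pixel_consistency_loss(student, teacher)
new_teacher = ema_update([1.0, -1.0], [0.0, 1.0], alpha=0.9)
```

In a typical Mean-Teacher training loop, only the student is updated by gradient descent on the supervised loss plus this consistency term, while the teacher is refreshed by the EMA step after each iteration. The feature-wise and class-wise consistency terms and the FDCF module described in the abstract are not covered by this sketch.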