Enhancing Nasopharyngeal Carcinoma Classification Based on Multi-View Cross-Modal Knowledge Distillation

Zhengjie Zhang, Crystal Cai, Sijia Du, Suncheng Xiang, Dahong Qian
DOI: https://doi.org/10.1109/isbi56570.2024.10635562
2024-01-01
Abstract: This study introduces a novel multi-view cross-modal knowledge distillation approach to improve the diagnostic accuracy of nasopharyngeal carcinoma (NPC) classification based on white light imaging (WLI). By leveraging metric learning and multi-view representations, the teacher network captures comprehensive details from the endoscopic images. A subsequent weighted cross-modal distillation step transfers this knowledge to the WLI-trained student network, addressing the challenge of imperfectly matched multi-view data. We conducted experiments on a dataset of 5,141 images from 633 patients to validate the proposed approach, achieving classification performance on WLI images comparable to that on narrow-band imaging (NBI) images. Additionally, our approach is highly adaptable to various knowledge distillation algorithms, offering significant optimization for multi-view distillation tasks. Code is available at https://github.com/krystal21/MV-CMD.
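To make the idea of weighted distillation concrete, the following is a minimal sketch of a generic temperature-softened distillation loss with per-pair weights. It is not the authors' implementation: the abstract does not specify how the weights are computed, so the `weights` argument, the function names, and the default temperature are all assumptions made for illustration. The intent is only to show how a teacher/student pair that is poorly matched across views could be down-weighted.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def weighted_distillation_loss(teacher_logits, student_logits, weights,
                               temperature=2.0):
    """Weighted average of per-pair KL(teacher || student).

    `weights` is a hypothetical per-pair confidence (assumption): a lower
    value reduces the influence of a teacher/student pair whose views are
    poorly matched. The T^2 factor follows common distillation practice.
    """
    if not (len(teacher_logits) == len(student_logits) == len(weights)):
        raise ValueError("inputs must have the same number of pairs")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    loss = 0.0
    for t, s, w in zip(teacher_logits, student_logits, weights):
        p_teacher = softmax(t, temperature)
        p_student = softmax(s, temperature)
        loss += w * temperature ** 2 * kl_divergence(p_teacher, p_student)
    return loss / total_weight
```

In this sketch, a pair with weight 0 contributes nothing to the loss, so a badly matched WLI/NBI pair can be ignored entirely, while well-matched pairs dominate the transfer signal.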