Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery

Long Bai, Mobarakol Islam, Hongliang Ren
2023-07-22
Abstract: The visual-question localized-answering (VQLA) system can serve as a knowledgeable assistant in surgical education. Besides providing text-based answers, the VQLA system can highlight the region of interest for better surgical scene understanding. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning new knowledge. Specifically, when DNNs learn on incremental classes or tasks, their performance on old tasks drops dramatically. Furthermore, due to medical data privacy and licensing issues, it is often difficult to access old data when updating continual learning (CL) models. Therefore, we develop a non-exemplar continual surgical VQLA framework to explore and balance the rigidity-plasticity trade-off of DNNs in a sequential learning paradigm. We revisit the distillation loss in CL tasks, and propose rigidity-plasticity-aware distillation (RP-Dist) and self-calibrated heterogeneous distillation (SH-Dist) to preserve the old knowledge. The weight aligning (WA) technique is also integrated to adjust the weight bias between old and new tasks. We further establish a CL framework on three public surgical datasets, in a surgical setting where classes overlap between old and new surgical VQLA tasks. With extensive experiments, we demonstrate that our proposed method reconciles learning and forgetting on continual surgical VQLA better than conventional CL methods. Our code is publicly accessible.
Computer Vision and Pattern Recognition, Computation and Language, Robotics
What problem does this paper attempt to address?
This paper addresses continual learning (CL) for visual-question localized-answering (VQLA) systems in robotic surgery, in particular how to avoid catastrophic forgetting of old knowledge while continuing to learn new tasks. Specifically, the paper focuses on the following points:

1. **Catastrophic Forgetting**: When deep neural networks (DNNs) learn new tasks or categories, their performance on old tasks declines significantly. This is especially severe in the medical field, where old data may be inaccessible due to privacy, storage, and licensing issues.
2. **Handling Overlapping Categories**: In practical applications, new and old tasks may share categories. Traditional continual learning methods may bias towards old categories when handling these overlapping categories, leading to poor learning outcomes for new categories.
3. **Multi-task Learning**: The VQLA system must not only provide textual answers but also highlight regions of interest so that the surgical scene is easier to understand. The system therefore needs to handle classification and localization simultaneously.

To address these issues, the paper proposes a non-exemplar continual surgical VQLA framework (CS-VQLA) and, by revisiting the distillation loss, introduces the following methods:

- **Rigidity-Plasticity-Aware Distillation (RP-Dist)**: By adjusting the temperature parameter, the model achieves higher plasticity on overlapping categories while maintaining rigidity on non-overlapping categories.
- **Self-Calibrated Heterogeneous Distillation (SH-Dist)**: Self-calibration operations are performed on intermediate feature maps to adapt to long-range contextual information.
- **Weight Aligning (WA)**: Adjusts the weight bias between new and old categories to prevent the model from biasing towards new categories.
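To make the temperature idea behind RP-Dist more concrete, here is a minimal sketch of a temperature-scaled distillation loss in which each class has its own temperature. It is only one possible reading of "adjusting the temperature parameter", not the paper's actual formulation: the function names, the per-class temperature vector, and the placeholder values `t_overlap` and `t_other` are all assumptions made for illustration.

```python
import math

def softmax(logits, temps):
    """Softmax where class i's logit is divided by its own temperature temps[i]."""
    scaled = [z / t for z, t in zip(logits, temps)]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def class_temperatures(num_classes, overlapping, t_overlap, t_other):
    """Hypothetical helper: one temperature for overlapping classes, another for the rest."""
    overlapping = set(overlapping)
    return [t_overlap if c in overlapping else t_other for c in range(num_classes)]

def distillation_loss(old_logits, new_logits, temps):
    """Cross-entropy between the softened outputs of the old and new models."""
    p_old = softmax(old_logits, temps)  # soft targets from the frozen old model
    p_new = softmax(new_logits, temps)  # predictions of the model being trained
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))

# Placeholder values: classes 1 and 2 are treated as overlapping.
temps = class_temperatures(4, overlapping=[1, 2], t_overlap=2.0, t_other=1.0)
loss = distillation_loss([2.0, 1.0, 0.5, -1.0], [1.5, 1.2, 0.3, -0.5], temps)
```

The loss is smallest when the new model reproduces the old model's logits exactly. Which temperature values correspond to more plasticity or more rigidity is set in the paper and is not reproduced here.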
With these methods, the paper reports that the proposed approach outperforms conventional CL methods on three public surgical datasets, balancing the learning of new tasks against the forgetting of old ones.
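The weight-aligning step can be illustrated with a short sketch. It follows the commonly described recipe of rescaling the classifier weight vectors of new classes so that their average norm matches that of the old classes. How the paper splits overlapping classes between "old" and "new" is not stated in this summary, so the `num_old` split and the function names below are assumptions.

```python
import math

def l2_norm(vec):
    """Euclidean norm of one class's weight vector."""
    return math.sqrt(sum(x * x for x in vec))

def weight_align(weights, num_old):
    """Rescale new-class weight vectors to match the mean norm of old-class vectors.

    weights: list of per-class weight vectors from the final classifier layer;
             the first num_old entries are assumed to belong to old classes.
    Returns the aligned weights and the scaling factor gamma.
    """
    norms = [l2_norm(w) for w in weights]
    mean_old = sum(norms[:num_old]) / num_old
    mean_new = sum(norms[num_old:]) / (len(weights) - num_old)
    gamma = mean_old / mean_new  # < 1 when new-class weights are larger on average
    aligned = [list(w) for w in weights[:num_old]]  # old classes stay unchanged
    aligned += [[gamma * x for x in w] for w in weights[num_old:]]
    return aligned, gamma

# Toy example: the new-class vectors have twice the norm of the old ones.
weights = [[3, 4], [0, 5], [6, 8], [0, 10]]
aligned, gamma = weight_align(weights, num_old=2)
```

In this toy case the new-class vectors are scaled down so that no group of classes dominates the logits purely because of weight magnitude.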