Abstract: Deep learning has proven highly effective for change detection (CD), most notably the Transformer architecture, whose self-attention mechanism captures long-range dependencies and outperforms traditional models. This capability gives the Transformer a significant advantage in capturing global-level features of complex object changes in high-resolution remote sensing images. Although Transformers are mature in Natural Language Processing (NLP), their application in computer vision, and in CD tasks in particular, is still nascent. Current research on Transformers for CD reveals limitations, especially under varied lighting and seasonal changes. To address this, we propose VisionTwinNet, a two-stage strategy. First, Gated EnhanceClearNet, a specially designed deep network, reduces image noise and enhances brightness while preserving shadows and correcting color distortions. Its gating mechanism adaptively adjusts the importance of features, yielding superior performance across various remote sensing image degradation problems. Second, we develop Hybrid Light-Robust CDNet, a robust, lightweight hybrid network custom-designed for CD in remote sensing images. This module integrates the strengths of CNNs and Transformers and introduces an attention mechanism that optimizes the key and value dimensions separately, rather than applying the traditional single linear transformation, to ensure efficient detection. Specifically, the LR-Transformer Block employs a lightweight multi-head self-attention mechanism that improves computational efficiency while providing richer feature representations. Comparative studies against six CD methods on three public datasets validate VisionTwinNet's robustness and efficacy. Our approach notably reduces algorithmic complexity and improves model efficiency.
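
The abstract says the attention mechanism sizes the key and value dimensions separately but gives no layer-level detail. The following is a minimal sketch of one possible reading, not the authors' implementation: a multi-head self-attention layer in plain Python where the key width and value width are independent settings. The class name `SplitDimAttention` and the parameters `key_dim` and `value_dim` are hypothetical, and the random weights stand in for learned ones.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SplitDimAttention:
    """Multi-head self-attention with independently chosen key and value widths.

    Hypothetical illustration: key_dim and value_dim are assumed names,
    not parameters taken from the VisionTwinNet paper.
    """

    def __init__(self, model_dim, num_heads, key_dim, value_dim, seed=0):
        rng = random.Random(seed)
        self.key_dim = key_dim
        # One (W_q, W_k, W_v) triple per head; queries share the key width.
        self.heads = [
            (
                rand_matrix(model_dim, key_dim, rng),
                rand_matrix(model_dim, key_dim, rng),
                rand_matrix(model_dim, value_dim, rng),
            )
            for _ in range(num_heads)
        ]
        # Output projection maps the concatenated heads back to model_dim.
        self.w_out = rand_matrix(num_heads * value_dim, model_dim, rng)

    def __call__(self, tokens):
        scale = 1.0 / math.sqrt(self.key_dim)
        head_outputs = []
        for w_q, w_k, w_v in self.heads:
            q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
            # scores[i][j] is the scaled dot product of query i with key j.
            scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
            weights = [softmax(row) for row in scores]
            head_outputs.append(matmul(weights, v))
        # Concatenate the heads along the feature axis.
        concat = [sum((h[i] for h in head_outputs), []) for i in range(len(tokens))]
        return matmul(concat, self.w_out)

    def num_parameters(self):
        """Count weights across all head projections and the output projection."""
        per_head = sum(len(w) * len(w[0]) for w in self.heads[0])
        return per_head * len(self.heads) + len(self.w_out) * len(self.w_out[0])


# Usage: 6 tokens of width 16, two heads, narrow keys (4) and wider values (8).
tokens = rand_matrix(6, 16, random.Random(1))
attn = SplitDimAttention(model_dim=16, num_heads=2, key_dim=4, value_dim=8)
out = attn(tokens)
baseline = SplitDimAttention(model_dim=16, num_heads=2, key_dim=8, value_dim=8)
```

In this sketch, narrowing only the key width shrinks the query and key projections and the cost of the score computation, while the value width, which determines how much information each head carries forward, is left untouched. Whether the paper's LR-Transformer Block makes the same trade-off is an assumption.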

Soft-TransFormers for Continual Learning

Progressive Learning without Forgetting

Forget-free Continual Learning with Soft-Winning SubNetworks

Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation

Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks

Continual Learning: Forget-free Winning Subnetworks for Video Representations

FCL-ViT: Task-Aware Attention Tuning for Continual Learning

Remembering Transformer for Continual Learning

Online Continual Learning with Contrastive Vision Transformer

A Cognition-Driven Framework for Few-Shot Class-Incremental Learning

Continual Learning via Learning a Continual Memory in Vision Transformer

Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks

ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models

Exemplar-Free Continual Transformer with Convolutions

VisionTwinNet: Gated Clarity Enhancement Paired With Light-Robust CD Transformers

FeTT: Continual Class Incremental Learning via Feature Transformation Tuning

Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning

Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models

Video Class-Incremental Learning with Clip Based Transformer

Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning

TransCL: Transformer Makes Strong and Flexible Compressive Learning