ConVaT: A Variational Generative Transformer With Momentum Contrastive Learning for Hyperspectral Image Classification

Miaomiao Liang, Zuo Liu, Jian Dong, Lingjuan Yu, Xiangchun Yu, Jun Li, Licheng Jiao
DOI: https://doi.org/10.1109/lgrs.2024.3367814
IF: 5.343
2024-03-02
IEEE Geoscience and Remote Sensing Letters
Abstract: Hyperspectral images provide plentiful latent information that requires exploration for ground object recognition, where self-supervised learning (SSL) is efficient and independent of manual labeling. However, severe spectral uncertainty poses a significant challenge to learning discriminative and generalizable representations by self-supervision. This letter proposes a variational generative transformer (VGT) with momentum contrastive supervision (ConVaT) to alleviate the problem. ConVaT contains two branches: a variational generative branch and a contrastive learning branch. The former guides informative data representation via an encoder–decoder transformer with variational inference; the latter makes the representation discriminative by distinguishing positive anchors from negative ones. Significantly, to facilitate a more generalizable latent representation, we reconstruct data with reparameterized tokens sampled multiple times from the global anchor, instead of the latent representation of the unmasked data. Extensive experiments on three public datasets show that ConVaT is superior in data representation with intraclass clustering and interclass distinction, and it achieves considerable improvements over present methods under linear probing, especially for the Indian Pines (IP) dataset with intense spectral uncertainty. Our code will be available at https://github.com/liuzuo-byte/ConVaT.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
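
The abstract names three generic building blocks: reparameterized sampling from a latent anchor, a momentum-updated encoder, and a contrastive objective over positive and negative anchors. The NumPy sketch below illustrates those general techniques only; it is not the authors' implementation. The function names, the InfoNCE form of the contrastive loss, and the defaults `m=0.999` and `temperature=0.07` are assumptions borrowed from common momentum-contrast practice, not details confirmed by the abstract.

```python
import numpy as np

def reparameterize(mu, logvar, n_samples, rng):
    """Draw n_samples latent tokens z = mu + sigma * eps with eps ~ N(0, I)."""
    sigma = np.exp(0.5 * logvar)
    eps = rng.standard_normal((n_samples,) + mu.shape)
    return mu + sigma * eps  # shape: (n_samples, *mu.shape)

def momentum_update(query_params, key_params, m=0.999):
    """Exponential moving average of the key encoder's weights (assumed MoCo-style)."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

def info_nce(q, k_pos, k_neg, temperature=0.07):
    """InfoNCE loss for one query, one positive key, and a bank of negative keys.

    q and k_pos have shape (d,); k_neg has shape (n_neg, d).
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, k_pos, k_neg = normalize(q), normalize(k_pos), normalize(k_neg)
    # Positive logit first, then one logit per negative key.
    logits = np.concatenate(([q @ k_pos], k_neg @ q)) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))

# Hypothetical usage: sample four tokens from an 8-dimensional anchor.
rng = np.random.default_rng(0)
tokens = reparameterize(np.zeros(8), np.zeros(8), n_samples=4, rng=rng)
```

Sampling several tokens from one `(mu, logvar)` pair mirrors the abstract's "sampled multiple times from the global anchor"; how the paper's decoder consumes those tokens is not specified there and is not modeled here.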
What problem does this paper attempt to address?