Abstract: Hyperspectral target detection (HTD) aims to identify specific materials based on spectral information in hyperspectral imagery and can detect point targets, some of which occupy less than one pixel. However, existing HTD methods are built on per-pixel binary classification, which limits their feature representation capability for point targets. In this paper, we rethink hyperspectral point target detection from the object detection perspective and focus on object-level prediction capability rather than pixel classification capability. Inspired by the token-based processing flow of Detection Transformer (DETR), we propose SpecDETR, the first specialized network for hyperspectral multi-class point object detection. Dispensing with the backbone of current object detection frameworks, SpecDETR treats the spectral features of each pixel in a hyperspectral image as a token and uses a multi-layer Transformer encoder with local and global coordination attention modules to extract deep spatial-spectral joint features. SpecDETR regards point object detection as a one-to-many set prediction problem, thereby achieving a concise and efficient DETR decoder that surpasses the current state-of-the-art DETR decoder in terms of parameters and accuracy in point object detection. We develop a simulated hyperSpectral Point Object Detection benchmark termed SPOD and, for the first time, evaluate and compare the performance of current object detection networks and HTD methods on hyperspectral multi-class point object detection. SpecDETR demonstrates superior performance compared to current object detection networks and HTD methods on the SPOD dataset. Additionally, we validate on a public HTD dataset that, by using data simulation instead of manual annotation, SpecDETR can detect real-world single-spectral point objects directly.

What problem does this paper attempt to address?

This paper addresses hyperspectral point object detection. Existing hyperspectral target detection (HTD) methods are built on per-pixel binary classification, which limits how well they can represent point object features. The researchers re-examine the problem from the object detection perspective, focusing on object-level prediction rather than pixel-level classification. Inspired by the Detection Transformer (DETR), they propose SpecDETR, the first network designed specifically for hyperspectral multi-class point object detection.

SpecDETR omits the backbone network used in conventional object detection frameworks. Instead, it treats the spectral features of each pixel in the hyperspectral image as a token and extracts deep joint spatial-spectral features with a multi-layer Transformer encoder that uses local and global coordination attention modules. It treats point object detection as a one-to-many set prediction problem, which yields a concise and efficient DETR decoder that surpasses the current state-of-the-art DETR decoder in terms of parameters and accuracy on point object detection.

To evaluate and compare current object detection networks and HTD methods on hyperspectral multi-class point object detection, the researchers constructed a simulated benchmark called SPOD. On the SPOD dataset, SpecDETR demonstrates superior performance compared to current object detection networks and HTD methods, particularly in detecting sub-pixel objects with extremely low spectral abundances. Moreover, by using data simulation instead of manual annotation, SpecDETR can directly detect real-world single-spectral point objects.

In summary, this paper addresses the insufficient feature representation for small targets in existing hyperspectral point object detection methods. It proposes a new Transformer-based detection network that improves detection of sub-pixel and small targets.
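The pixel-as-token idea described above can be illustrated with a minimal sketch. This is not SpecDETR's implementation: the function names `pixels_to_tokens` and `token_index_to_pixel`, the (height, width, bands) array layout, and the random test cube are all assumptions made for illustration. It only shows how a hyperspectral cube can be flattened so that each pixel's spectrum becomes one token, with no backbone or patch embedding in between.

```python
from typing import Tuple

import numpy as np


def pixels_to_tokens(cube: np.ndarray) -> np.ndarray:
    """Flatten an (H, W, C) hyperspectral cube into (H*W, C) tokens.

    Hypothetical helper: each row of the result is the full spectrum
    of one pixel, in row-major (raster) order.
    """
    if cube.ndim != 3:
        raise ValueError("expected a cube of shape (height, width, bands)")
    height, width, bands = cube.shape
    return cube.reshape(height * width, bands)


def token_index_to_pixel(index: int, width: int) -> Tuple[int, int]:
    """Map a token index back to its (row, column) pixel position."""
    return divmod(index, width)


# Assumed toy input: a 4 x 5 image with 30 spectral bands.
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 30))

tokens = pixels_to_tokens(cube)
row, col = token_index_to_pixel(7, width=cube.shape[1])
```

Here `tokens` holds one spectrum per pixel, and `token_index_to_pixel` recovers where a given token came from, which is the positional information an encoder would need to apply local attention around each pixel.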

SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network

LSH-DETR: Object Detection Algorithm for Marine Benthic Organisms Based on Improved DETR

Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection

DETR-ORD: An Improved DETR Detector for Oriented Remote Sensing Object Detection with Feature Reconstruction and Dynamic Query

DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection

An Improved DETR Based on Angle Denoising and Oriented Boxes Refinement for Remote Sensing Object Detection

PR-Deformable DETR: DETR for Remote Sensing Object Detection

SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency

Detrex: Benchmarking Detection Transformers

KeypointDETR: an End-to-End 3D Keypoint Detector

Rank-DETR for High Quality Object Detection

Distilling DETR-like Detectors with Instance-Aware Feature

DFS-DETR: Detailed-Feature-Sensitive Detector for Small Object Detection in Aerial Images Using Transformer

V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection

Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection

Anchor DETR: Query Design for Transformer-Based Detector

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers

L-DETR: A Light-Weight Detector for End-to-End Object Detection With Transformers

Deformable DETR: Deformable Transformers for End-to-End Object Detection

Transformer Based Remote Sensing Object Detection with Enhanced Multispectral Feature Extraction