Abstract: Although deep neural networks have made significant progress in remote sensing image scene classification, most existing work assumes that the training and test data are independently and identically distributed. When a scene classification model is deployed in the real world, however, it will inevitably encounter test data whose distribution differs from that of the training set, leading to unpredictable errors at inference time. For instance, in large-scale remote sensing scene classification applications, it is difficult to cover every class during training. Consequently, at inference time the model will assign images from unknown classes to one of the known classes. Out-of-distribution (OOD) detection is therefore crucial for ensuring the reliability and safety of remote sensing scene classification models in real-world scenarios. Despite significant advances in OOD detection methods in recent years, there is still no unified benchmark for evaluating OOD methods specifically on remote sensing scene classification tasks. We designed benchmarks on three classical remote sensing datasets to simulate scenes with different distribution shifts. Ten OOD detection methods of different types were evaluated and compared using quantitative metrics across these benchmarks. The comparative results show that virtual-logit matching methods, which require no additional training, outperform the other types of methods on our benchmarks, suggesting that additional training methods are unnecessary for remote sensing image scene classification applications.
Furthermore, we provide insights into OOD detection models and into improving their performance in the real world. To the best of our knowledge, this study is the first evaluation and analysis of methods for detecting out-of-distribution data in remote sensing. We hope that this research will serve as a foundational resource for future studies on out-of-distribution detection in remote sensing.

Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models

A Survey on Evaluation of Out-of-Distribution Generalization

How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection

OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization

TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning

Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts

OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

Recent Advances in OOD Detection: Problems and Approaches

GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts

Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks

Towards Out-Of-Distribution Generalization: A Survey

Effective Robustness against Natural Distribution Shifts for Models with Different Training Data

An Empirical Study on Distribution Shift Robustness from the Perspective of Pre-Training and Data Augmentation

Evaluation of Ten Deep-Learning-Based Out-of-Distribution Detection Methods for Remote Sensing Image Scene Classification

GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-Distribution Generalization Perspective

Out-of-Distribution Learning with Human Feedback

PViT: Prior-augmented Vision Transformer for Out-of-distribution Detection

Out-Of-Distribution Detection with Diversification (Provably)