Abstract: Machine learning models deployed in open-world scenarios often encounter unfamiliar conditions and perform poorly in unanticipated situations. As AI systems advance and find application in safety-critical domains, effectively handling out-of-distribution (OOD) data is crucial to building open-world learning systems. In this work, we introduce ALOE, a novel active learning algorithm for open-world environments designed to enhance model adaptation by incorporating new OOD classes via a two-stage approach. First, diversity sampling selects a representative set of examples, followed by energy-based OOD detection to prioritize likely unknown classes for annotation. This strategy accelerates class discovery and learning, even under constrained annotation budgets. Evaluations on three long-tailed image classification benchmarks demonstrate that ALOE outperforms traditional active learning baselines, effectively expanding known categories while balancing annotation cost. Our findings reveal a crucial tradeoff between enhancing known-class performance and discovering new classes, setting the stage for future advancements in open-world machine learning.

What problem does this paper attempt to address?

This paper aims to solve the challenges encountered by machine learning models in open-world scenarios. Specifically, when these models are deployed in real-world environments, they often encounter unseen conditions and unforeseen situations, resulting in poor performance. Especially when applying AI systems in safety-critical fields, effectively handling out-of-distribution (OOD) data is crucial for building open-world machine learning systems.

The paper proposes a new active learning algorithm named ALOE (Active Learning in Open-world Environments), which is specifically designed to enhance model adaptability in open-world environments. By introducing energy-based OOD detection techniques, ALOE can accelerate the discovery and learning of new classes under a limited annotation budget, thereby improving the model's ability to recognize unknown classes and balancing the performance improvement of known and unknown classes.

#### Main problems summarized:

1. **Handling OOD data**: Traditional machine learning models usually assume that training and test data come from the same distribution, but in the real world, models will inevitably encounter previously unseen OOD data.
2. **Adaptability in open-world scenarios**: In the open world, models need to be able to dynamically recognize and learn new classes, not just recognize known classes.
3. **High-cost manual annotation**: Obtaining manual annotations for these new classes is often time-consuming and expensive, so effective strategies are required to reduce annotation costs.
4. **Class imbalance problem**: In long-tailed distribution datasets, random sampling cannot effectively discover rare classes, so a more intelligent selection strategy is needed.

By combining diversity sampling and energy-based OOD detection, ALOE provides a comprehensive solution to address the above challenges, especially for open-world environments in multi-class classification tasks.
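
The two-stage selection described above (diversity sampling followed by energy-based OOD ranking) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes k-center greedy (farthest-point) sampling as the diversity step, which the summary does not specify, and it uses the commonly cited free-energy score E(x) = -T · logsumexp(f(x)/T) over classifier logits, where higher energy suggests an example is more likely OOD. The function names, the `n_candidates` and `budget` parameters, and the random inputs are all hypothetical.

```python
import numpy as np


def energy_score(logits, temperature=1.0):
    """Free energy E(x) = -T * logsumexp(logits / T), one value per row.

    Higher energy is treated here as a sign the example may be OOD.
    """
    z = logits / temperature
    m = z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    lse = m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1))
    return -temperature * lse


def k_center_greedy(features, k, seed=0):
    """Assumed diversity step: farthest-point sampling over feature vectors."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(features)))
    chosen = [first]
    # Distance from every point to its nearest already-chosen point.
    dist = np.linalg.norm(features - features[first], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return np.array(chosen)


def select_for_annotation(features, logits, n_candidates, budget):
    """Stage 1: pick a diverse candidate pool. Stage 2: keep the highest-energy ones."""
    candidates = k_center_greedy(features, n_candidates)
    energies = energy_score(logits[candidates])
    order = np.argsort(-energies)  # highest energy (most OOD-like) first
    return candidates[order[:budget]]


# Hypothetical unlabeled pool: 200 examples, 8-d features, 5 known classes.
rng = np.random.default_rng(42)
pool_features = rng.normal(size=(200, 8))
pool_logits = rng.normal(size=(200, 5))
picked = select_for_annotation(pool_features, pool_logits, n_candidates=50, budget=10)
```

In an actual active learning loop, the `picked` indices would be sent for annotation, newly discovered classes would be added to the label set, and the model would be retrained before the next round. The split between the candidate pool size and the final budget is a design choice the summary leaves open.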

Deep Active Learning in the Open World

Open World Classification with Adaptive Negative Samples

Detecting and Learning Out-of-Distribution Data in the Open world: Algorithm and Theory

Open Long-Tailed Recognition In A Dynamic World

Open-CRB: Towards Open World Active Learning for 3D Object Detection

From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Open-world Machine Learning: A Review and New Outlooks

An Online Active Broad Learning Approach for Real-Time Safety Assessment of Dynamic Systems in Nonstationary Environments

Learning to Augment Distributions for Out-of-Distribution Detection

OAL: Enhancing OOD Detection Using Latent Diffusion

Open-environment Machine Learning

Open-world Learning and Application to Product Classification

Deep Active Learning via Open Set Recognition

Self-Supervised Features Improve Open-World Learning

Dynamic Against Dynamic: An Open-set Self-learning Framework

Out-of-Distribution Learning with Human Feedback

A Unified Approach Towards Active Learning and Out-of-Distribution Detection

Bidirectional Uncertainty-Based Active Learning for Open Set Annotation

Online Adaptive Asymmetric Active Learning With Limited Budgets

OW-Adapter: Human-Assisted Open-World Object Detection with a Few Examples

OpenAL: An Efficient Deep Active Learning Framework for Open-Set Pathology Image Classification