Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models

Sharat Agarwal

2024-11-20

Abstract: Objects in the real world rarely occur in isolation; they exhibit typical arrangements governed by their independent utility and by their expected interaction with humans and other objects in the context. For example, a chair is expected near a table, and a computer is expected on top of it. Humans use this spatial context and relative placement as an important cue for visual recognition in case of ambiguities. Like humans, DNNs exploit contextual information from data to learn representations. Our research focuses on harnessing the contextual aspects of visual data to optimize data annotation and enhance the training of deep networks. Our contributions can be summarized as follows: (1) We introduce the notion of contextual diversity for active learning (CDAL) and show its applicability in three different visual tasks: semantic segmentation, object detection, and image classification. (2) We propose a data repair algorithm to curate contextually fair data to reduce model bias, enabling the model to detect objects out of their obvious context. (3) We propose class-based annotation, in which contextually relevant classes that are complementary for model training under domain shift are selected. Understanding the importance of well-curated data, we also emphasize the necessity of involving humans in the loop to achieve accurate annotations and to develop novel interaction strategies that allow humans to serve as fact-checkers. In line with this, we are developing an image retrieval system for wildlife camera-trap images and a reliable warning system for poor-quality rural roads. For large-scale annotation, we employ a strategic combination of human expertise and zero-shot models, while also integrating human input at various stages for continuous feedback.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper aims to make the training of deep models more efficient by leveraging the contextual uncertainty in visual data. Specifically, it focuses on the following aspects:

1. **Introducing Contextual Diversity**: The paper proposes the concept of "Contextual Diversity (CD)" and demonstrates its application in three visual tasks: semantic segmentation, object detection, and image classification. The concept is used to select training data with diverse contexts so that the model generalizes better.
2. **Data Repair Algorithm**: To reduce model bias, the paper proposes a data repair algorithm for curating contextually fair datasets. This enables the model to better detect objects that appear outside their typical context.
3. **Class-based Labeling Strategy**: The paper proposes a class-based labeling method that selects contextually relevant classes for labeling in order to address domain shift. This helps the model generalize better in new domains.
4. **Human-Machine Collaborative Labeling**: The paper emphasizes the importance of human feedback in the data-labeling process, especially in large-scale labeling tasks. Combining the knowledge of human experts with zero-shot models allows the labeling task to be completed more efficiently.

Overall, the paper aims to improve the performance and robustness of deep-learning models in visual tasks by optimizing the quality and diversity of data and by introducing human-machine collaborative methods.
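To make the idea of selecting training data with diverse contexts (item 1) more concrete, here is a minimal sketch of diversity-driven sample selection for active learning. It is not the paper's CDAL algorithm, whose details are not given here. It assumes, purely for illustration, that each image is summarized by a predicted class-probability vector, that dissimilarity is measured by symmetric KL divergence, and that samples are picked by greedy farthest-point selection. The function names `symmetric_kl` and `select_diverse` are hypothetical.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two discrete distributions.

    `eps` guards against log(0); it is an illustrative smoothing choice.
    """
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return kl_pq + kl_qp


def select_diverse(pool, labeled, budget):
    """Greedy farthest-point selection over an unlabeled pool.

    pool    -- list of probability vectors for unlabeled images (assumed input)
    labeled -- list of probability vectors for already-labeled images
    budget  -- number of pool items to pick for annotation
    Returns the indices of the chosen pool items, in selection order.
    """
    # Distance from each pool item to its nearest labeled item
    # (infinite when nothing has been labeled yet).
    min_dist = [
        min((symmetric_kl(p, r) for r in labeled), default=math.inf)
        for p in pool
    ]
    chosen = []
    for _ in range(min(budget, len(pool))):
        # Pick the item that is farthest from everything selected so far.
        idx = max(range(len(pool)), key=lambda i: min_dist[i])
        chosen.append(idx)
        min_dist[idx] = -1.0  # mark as taken so it is never picked again
        # Update the remaining items with their distance to the new pick.
        for i, p in enumerate(pool):
            if min_dist[i] >= 0:
                min_dist[i] = min(min_dist[i], symmetric_kl(p, pool[idx]))
    return chosen
```

Calling `select_diverse(pool, labeled, budget)` returns the pool indices to send for annotation. Under these assumptions, near-duplicates of already-labeled images are picked last, which is the intuition behind preferring contextually diverse samples over redundant ones.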
