Abstract: Deep Neural Networks (DNNs), despite notable progress across information retrieval tasks, suffer from shortcut learning and generalize poorly because they rely on spurious correlations between features and labels. Current research mainly mitigates shortcut learning with augmentation and distillation techniques, but these methods can be laborious and may introduce unwarranted biases. To address these limitations, we propose COMI, a novel method to COrrect and MItigate shortcut learning behavior. Inspired by how students address shortcuts in educational scenarios, we aim to reduce the model's reliance on shortcuts and enhance its ability to extract underlying information, integrated with standard Empirical Risk Minimization (ERM). Specifically, we first design a Correct Habit (CoHa) strategy to retrieve the top.. challenging samples for priority training, which encourages the model to rely less on shortcuts early in training. Then, to extract more meaningful underlying information, the information derived from ERM is separated into task-relevant and task-irrelevant information: the former serves as the primary basis for model predictions, while the latter is considered non-essential. However, within the task-relevant information, certain potential shortcuts contribute to overconfident predictions. To mitigate this, we design a Deep Mitigation (DeMi) network with a shortcut margin loss to adaptively control the feature weights of shortcuts and eliminate their influence. In addition, to address the problem of unknown shortcut tokens in NLP, we adopt a locally interpretable module, LIME, to help identify shortcut tokens. Finally, extensive experiments on NLP and CV tasks demonstrate the effectiveness of COMI, which performs well on both IID and OOD samples.
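
The idea of prioritizing challenging samples early in training can be illustrated with a minimal sketch. This is not the CoHa strategy itself: the abstract does not say how difficulty is measured or how many samples are retrieved, so the sketch assumes that "challenging" can be approximated by high per-sample cross-entropy loss under some reference model. The function names `cross_entropy` and `prioritize_hard_samples` and the parameter `m` are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Per-sample cross-entropy loss for one predicted class distribution."""
    return -math.log(max(probs[label], 1e-12))

def prioritize_hard_samples(pred_probs, labels, m):
    """Return a training order that places the m highest-loss samples first.

    Assumption (not from the paper): difficulty is approximated by the
    loss a reference model assigns to each sample.
    """
    losses = [cross_entropy(p, y) for p, y in zip(pred_probs, labels)]
    # Rank sample indices from highest loss to lowest loss.
    ranked = sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)
    hard = ranked[:m]           # samples trained on first
    rest = sorted(ranked[m:])   # remaining samples keep their original order
    return hard + rest

# Toy predictions from a hypothetical reference model on four samples.
pred_probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.55, 0.45]]
labels = [0, 0, 1, 1]
order = prioritize_hard_samples(pred_probs, labels, m=2)
```

In this toy example, samples 1 and 3 are misclassified by the reference model, so they receive the highest loss and move to the front of the training order.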

Monitoring Shortcut Learning using Mutual Information

Shortcut learning in deep neural networks

Navigating Shortcuts, Spurious Correlations, and Confounders: From Origins via Detection to Mitigation

Detecting shortcut learning for fair medical AI using shortcut testing

Benchmarking Dependence Measures to Prevent Shortcut Learning in Medical Imaging

COMI: COrrect and MItigate Shortcut Learning Behavior in Deep Neural Networks

How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning

Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning

A Low-cost Strategic Monitoring Approach for Scalable and Interpretable Error Detection in Deep Neural Networks

Data-Efficient Mutual Information Neural Estimator

Shortcut Detection with Variational Autoencoders

Learning Concept Credible Models for Mitigating Shortcuts

Mutual Information Analysis in Multimodal Learning Systems

A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets

Patch Shortcuts: Interpretable Proxy Models Efficiently Find Black-Box Vulnerabilities

A robust estimator of mutual information for deep learning interpretability

Dissecting Deep Learning Networks—Visualizing Mutual Information

Shortcut Learning in In-Context Learning: A Survey

Deephys: Deep Electrophysiology, Debugging Neural Networks under Distribution Shifts

Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models