Abstract: The influence function, a technique rooted in robust statistics, has been adapted in modern machine learning for a novel application: data attribution, i.e., quantifying how individual training data points affect a model's predictions. However, the common derivation of influence functions in the data attribution literature is limited to loss functions that can be decomposed into a sum of individual data point losses, with the most prominent examples known as M-estimators. This limitation prevents influence functions from being applied to more complex learning objectives, which we refer to as non-decomposable losses, such as contrastive or ranking losses, where a unit loss term depends on multiple data points and cannot be decomposed further. In this work, we bridge this gap by revisiting the general formulation of the influence function from robust statistics, which extends beyond M-estimators. Based on this formulation, we propose a novel method, the Versatile Influence Function (VIF), that can be straightforwardly applied to machine learning models trained with any non-decomposable loss. In comparison to the classical approach in statistics, the proposed VIF is designed to fully leverage the power of automatic differentiation, thereby eliminating the need for case-specific derivations for each loss function. We demonstrate the effectiveness of VIF across three examples: Cox regression for survival analysis, node embedding for network analysis, and listwise learning-to-rank for information retrieval. In all cases, the influence estimated by VIF closely resembles the results obtained by brute-force leave-one-out retraining, while being up to $10^3$ times faster to compute. We believe VIF represents a significant advancement in data attribution, enabling efficient influence-function-based attribution across a wide range of machine learning paradigms, with broad potential for practical use cases.
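
For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of the classical influence function for a decomposable loss, compared with exact leave-one-out retraining. It is not the VIF method and does not cover non-decomposable losses. The model (one-parameter ridge regression), the toy data, and the function names `fit`, `loo_exact`, and `influence_approx` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: classical influence function for a decomposable
# loss (1-D ridge regression). This is NOT the VIF method from the abstract.
# All names and data below are hypothetical.

def fit(xs, ys, lam):
    """Closed-form minimizer of sum_i 0.5*(y_i - theta*x_i)^2 + 0.5*lam*theta^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def loo_exact(xs, ys, lam, i):
    """Exact change in theta from retraining without training point i."""
    xs_rest = xs[:i] + xs[i + 1:]
    ys_rest = ys[:i] + ys[i + 1:]
    return fit(xs_rest, ys_rest, lam) - fit(xs, ys, lam)

def influence_approx(xs, ys, lam, i):
    """First-order estimate of the same change: H^{-1} * grad of point i's loss."""
    theta = fit(xs, ys, lam)
    hessian = sum(x * x for x in xs) + lam      # second derivative of the full objective
    grad_i = -(ys[i] - theta * xs[i]) * xs[i]   # derivative of point i's loss at theta
    return grad_i / hessian

# Toy data (made up for this example).
xs = [0.5, -1.0, 1.5, 2.0, -0.5, 1.0, -1.5, 0.8]
ys = [0.4, -1.2, 1.9, 2.1, -0.3, 1.4, -1.1, 0.5]
lam = 1.0

for i in range(len(xs)):
    print(i, loo_exact(xs, ys, lam, i), influence_approx(xs, ys, lam, i))
```

In this toy setting the estimate differs from the exact leave-one-out change by a factor that depends on how much point `i` contributes to the Hessian, so the two agree closely for points with small `x_i`. The sketch relies on the loss being a sum of per-point terms, which is the assumption the abstract says fails for contrastive or ranking losses.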

Influence-based Attributions can be Manipulated

Adversarial Attacks on Data Attribution

A Versatile Influence Function for Data Attribution with Non-Decomposable Loss

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

Influence Functions in Deep Learning Are Fragile

Generative neural networks for experimental manipulation: Examining dominance-trustworthiness face impressions with data-efficient models

Revisiting the Fragility of Influence Functions

Rethinking Robustness of Model Attributions

Intriguing Properties of Data Attribution on Diffusion Models

Influence Functions for Scalable Data Attribution in Diffusion Models

DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models

Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions

Additive Feature Attribution Explainable Methods to Craft Adversarial Attacks for Text Classification and Text Regression

Adversarial Attack Attribution: Discovering Attributable Signals in Adversarial ML Attacks

InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks

Attribution-driven Causal Analysis for Detection of Adversarial Examples

Statistical and Computational Guarantees for Influence Diagnostics

Delta-Influence: Unlearning Poisons via Influence Functions

Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation

Disentangling Influence: Using Disentangled Representations to Audit Model Predictions