Abstract: Traditional deep learning interpretability methods that are suitable for model users cannot explain network behaviors at the global level and are inflexible at providing fine-grained explanations. As a solution, concept-based explanations are gaining attention due to their human intuitiveness and their flexibility to describe both global and local model behaviors. Concepts are groups of similarly meaningful pixels that express a notion and are embedded within the network's latent space; they have commonly been hand-generated, but have recently been discovered by automated approaches. Unfortunately, the magnitude and diversity of discovered concepts make it difficult to navigate and make sense of the concept space. Visual analytics can serve a valuable role in bridging these gaps by enabling structured navigation and exploration of the concept space to provide users with concept-based insights into model behavior. To this end, we design, develop, and validate ConceptExplainer, a visual analytics system that enables people to interactively probe and explore the concept space to explain model behavior at the instance, class, and global levels. The system was developed via iterative prototyping to address a number of design challenges that model users face in interpreting the behavior of deep learning models. Via a rigorous user study, we validate how ConceptExplainer addresses these challenges. We also present a series of usage scenarios to demonstrate how the system supports the interactive analysis of model behavior across a variety of tasks and explanation granularities, such as identifying concepts that are important to classification, identifying bias in training data, and understanding how concepts can be shared across diverse and seemingly dissimilar classes.

ConceptX: A Framework for Latent Concept Analysis

NxPlain: Web-based Tool for Discovery of Latent Concepts

Latent Concept-based Explanation of NLP Models

Analyzing Encoded Concepts in Transformer Language Models

ConceptExplainer: Interactive Explanation for Deep Neural Networks from a Concept Perspective

Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models

Concept Induction using LLMs: a user experiment for assessment

LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions

Sparse Linear Concept Discovery Models

A Concept-Based Explainability Framework for Large Multimodal Models

Scaling up Discovery of Latent Concepts in Deep NLP Models

Explaining Language Models' Predictions with High-Impact Concepts

Global Concept Explanations for Graphs by Contrastive Learning

ConLUX: Concept-Based Local Unified Explanations

Abstracting Deep Neural Networks into Concept Graphs for Concept Level Interpretability

Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery

From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation

Concept Embedding Analysis: A Review

A Language Model based Framework for New Concept Placement in Ontologies

Concept Bottleneck Language Models For protein design

A Self-explaining Neural Architecture for Generalizable Concept Learning