Abstract: In this chapter we present a method for learning a compositional model in a minimax entropy framework for modeling object categories with large intra-class variance. The model we learn combines the flexibility of a stochastic context-free grammar (SCFG), which accounts for the variation in object structure, with the neighborhood constraints of a Markov random field (MRF), which enforce spatial context. We learn the model through a generalized minimax entropy framework that accounts for the dynamic structure of the hierarchical model. We first learn the SCFG parameters using the frequencies of object parts, then pursue spatial relations in order of greatest information gain. The learned model can generalize from a small set of training samples (n < 100) to generate a combinatorially large number of novel instances using stochastic sampling. To verify our learning method and model performance, we present plots of KL divergence minimization as the algorithm proceeds, and show that samples from the model become more realistic as more spatial relations are added. We also show the model accurately predicting missing or undetected parts for top-down recognition, along with preliminary results showing that the model can learn a large space of category appearances from a very small number (n < 15) of training samples. This process is similar to “recognition-by-components”, a theory that postulates that biological vision systems recognize objects as compositions of a dictionary of commonly appearing 3D structures. Finally, we discuss a compositional boosting algorithm for inference and show examples using it for object recognition.

This article is a chapter from the book Object Categorization: Computer and Human Vision Perspectives, edited by Sven Dickinson, Ales Leonardis, Bernt Schiele, and Michael J. Tarr (Cambridge University Press). Affiliations: University of California Los Angeles, Los Angeles, CA; Lotus Hill Research Institute, EZhou, China.
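The two-stage learning procedure summarized in the abstract (part frequencies first, then spatial relations by information gain) can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names, the part and relation labels, and the use of KL divergence between an observed histogram and a model-sampled histogram as a stand-in for information gain are all assumptions made for illustration. A full pursuit loop would also re-sample the model after each relation is added, which is omitted here.

```python
import math
from collections import Counter


def estimate_branch_probs(observed_choices):
    """Relative-frequency estimate of branching probabilities at one
    grammar node (hypothetical helper, not from the chapter)."""
    counts = Counter(observed_choices)
    total = sum(counts.values())
    return {choice: n / total for choice, n in counts.items()}


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two histograms given as equal-length probability lists."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


def next_relation(observed_hists, model_hists):
    """Pick the candidate relation whose observed histogram differs most
    from the current model's histogram (assumed proxy for information gain)."""
    return max(observed_hists,
               key=lambda name: kl_divergence(observed_hists[name],
                                              model_hists[name]))


# Hypothetical training data: which sub-part was chosen in each sample.
print(estimate_branch_probs(["round", "round", "square", "round"]))

# Hypothetical relation histograms: observed data vs. samples from the model.
observed = {"relative_position": [0.9, 0.1], "relative_scale": [0.5, 0.5]}
model = {"relative_position": [0.5, 0.5], "relative_scale": [0.5, 0.5]}
print(next_relation(observed, model))
```

In this toy setting the relation whose statistics the current model matches worst is selected first, which mirrors the idea of adding constraints in order of how much they reduce the gap between model and data.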

Flexible Compositional Learning of Structured Visual Concepts

Compositional diversity in visual concept learning

Compositional learning of functions in humans and machines

Compositional Substitutivity of Visual Reasoning for Visual Question Answering

Compositional Learning of Visually-Grounded Concepts Using Reinforcement

Learning Unseen Concepts Via Hierarchical Decomposition and Composition

Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language

A Study of Compositional Generalization in Neural Models

Learning Visual Composition through Improved Semantic Guidance

Towards Compositionality in Concept Learning

Compositional Zero-shot Learning Via Progressive Language-based Observations

Constellation: Learning relational abstractions over objects for compositional imagination

Learning to generalize to new compositions in image understanding

Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings

Concepts, Properties and an Approach for Compositional Generalization

Towards a Unified Compositional Model for Visual Pattern Modeling

A Survey on Compositional Learning of AI Models: Theoretical and Experimental Practices

Human few-shot learning of compositional instructions

Compositional Entailment Learning for Hyperbolic Vision-Language Models

Learning Compositional Models for Object Categories from Small Sample Sets