Abstract: This dissertation presents a composite template model, named the And-Or graph, for representing objects with large structural variability. Intuitively, an And-node represents a decomposition of a graphical structure, which expands to a set of Or-nodes with associated relations; an Or-node serves as a switch variable pointing to alternative And-nodes. A traversal from the root node of the And-Or graph, named the parse graph, produces a configuration of the terminal nodes (sub-templates) under the soft and hard relations inherited from their ancestor nodes. The And-Or graph representation can generate a large set of constrained configurations with a relatively small number of graph nodes, and thus accounts for great structural variation. The And-Or graph model is tested on tasks such as modeling and sketching human faces and clothes. A hierarchical-compositional model of human faces is built as a three-layer And-Or graph. Faces are represented hierarchically: the first layer treats each face as a whole; the second layer refines the local facial parts jointly as a set of individual templates; the third layer further divides the face into 16 zones and models detailed facial features such as eye corners, marks, or wrinkles. Transitions between the layers are realized by measuring the minimum description length (MDL) given the complexity of an input face image. Diverse face representations are formed by drawing from dictionaries of global faces, parts, and skin detail features. A sketch captures the most informative part of a face in a much more concise and potentially more robust representation. However, generating good facial sketches is extremely challenging because of the rich facial details and large structural variations, especially in high-resolution images. The representational power of our generative model is demonstrated by reconstructing high-resolution face images and generating cartoon facial sketches.
Our model is useful for a wide variety of applications, including recognition, non-photorealistic rendering, super-resolution, and low-bit-rate face coding. Cloth modeling and recognition is an important and challenging problem in both vision and graphics tasks, such as dressed human recognition and tracking, and human sketching and portraiture. We built an And-Or graph model to represent different clothing configurations, such as T-shirts, jackets, etc. In a supervised learning phase, we ask an artist to draw sketches of a set of dressed people, and we decompose the sketches into categories of cloth and body components: collars, shoulders, cuffs, hands, pants, shoes, etc. Each component has a number of distinct sub-templates (sub-graphs). An algorithm that integrates bottom-up proposals and top-down information is proposed to infer the composite clothes template efficiently from the image.
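
The And-Or structure described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the class names, the toy face graph, the part names, and the uniform choice at each Or-node are all assumptions made for illustration, and the soft and hard relations between nodes are omitted. Terminals stand in for leaf sub-templates. The sketch shows how an And-node expands all of its children, how an Or-node switches to one alternative, and how a small graph yields many configurations.

```python
import random
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Terminal:
    name: str  # a leaf sub-template (hypothetical names below)


@dataclass
class AndNode:
    name: str
    children: List["Node"]  # decomposition: every child is expanded


@dataclass
class OrNode:
    name: str
    alternatives: List["Node"]  # switch: exactly one alternative is chosen


Node = Union[Terminal, AndNode, OrNode]


def count_nodes(node: Node) -> int:
    """Number of nodes in the (tree-shaped) toy graph."""
    if isinstance(node, Terminal):
        return 1
    kids = node.children if isinstance(node, AndNode) else node.alternatives
    return 1 + sum(count_nodes(k) for k in kids)


def count_configurations(node: Node) -> int:
    """And-nodes multiply their children's counts; Or-nodes add them."""
    if isinstance(node, Terminal):
        return 1
    if isinstance(node, AndNode):
        total = 1
        for child in node.children:
            total *= count_configurations(child)
        return total
    return sum(count_configurations(alt) for alt in node.alternatives)


def sample_parse(node: Node, rng: random.Random) -> List[str]:
    """Traverse from the root and return the selected terminal names.

    Assumption: each Or-node picks an alternative uniformly at random.
    """
    if isinstance(node, Terminal):
        return [node.name]
    if isinstance(node, AndNode):
        selected: List[str] = []
        for child in node.children:
            selected.extend(sample_parse(child, rng))
        return selected
    return sample_parse(rng.choice(node.alternatives), rng)


def or_of(part: str, n: int) -> OrNode:
    """Hypothetical helper: an Or-node over n terminal sub-templates."""
    return OrNode(part, [Terminal(f"{part}_{i}") for i in range(n)])


# Toy graph: a face decomposes into three parts, each with alternatives.
face = AndNode("face", [or_of("eyes", 3), or_of("nose", 2), or_of("mouth", 4)])

print(count_nodes(face))           # 13 nodes
print(count_configurations(face))  # 24 configurations
print(sample_parse(face, random.Random(0)))
```

In this toy graph, 13 nodes generate 3 × 2 × 4 = 24 configurations, and the gap widens quickly as more Or-nodes are added. A fuller model would also attach relations to the And-nodes and learned probabilities to the Or-node branches.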
