Abstract: State-of-the-art methods for sketch classification and retrieval rely on deep convolutional neural networks to learn representations. Although deep neural networks can model images with hierarchical representations built from convolution kernels, they cannot automatically extract the structural representations of object categories in a human-perceptible way. Furthermore, sketch images usually exhibit large-scale visual variations caused by drawing styles or viewpoints, which makes it difficult to develop generalized representations using the fixed computational mode of convolutional kernels. In this paper, we aim to address the problem of the fixed computational mode in the feature extraction process without extra supervision. We propose a novel architecture that dynamically discovers object landmarks and learns discriminative structural representations. Our model is composed of two components: a representative landmark discovering module that localizes the key points on the object, and a category-aware representation learning module that develops category-specific features. Specifically, we develop a structure-aware offset layer to dynamically localize the representative landmarks, which is optimized based on the category labels without extra supervision. After that, a diversity branch is introduced to extract the global discriminative features for each category. Finally, we employ a multi-task loss function to develop an end-to-end trainable architecture. At test time, we fuse the predictions obtained with different numbers of landmarks to produce the final results. Through extensive experiments, we compare our model with several state-of-the-art methods on two challenging datasets, TU-Berlin and Sketchy, for sketch classification and retrieval, and the experimental results demonstrate the effectiveness of our proposed model.
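
The abstract states that predictions obtained with different numbers of landmarks are fused at test time, but it does not say how. As a minimal sketch only, the following assumes the simplest plausible rule: average the softmax class probabilities of each landmark configuration and take the arg-max. The function names (`softmax`, `fuse_predictions`, `predict`) and the averaging rule are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(logits_per_config):
    """Average class probabilities across landmark configurations.

    logits_per_config: one list of class scores per configuration
    (hypothetically, one per number of landmarks). Plain averaging is
    an assumption; the abstract does not specify the fusion rule.
    """
    if not logits_per_config:
        raise ValueError("need at least one set of class scores")
    probs = [softmax(scores) for scores in logits_per_config]
    n_classes = len(probs[0])
    if any(len(p) != n_classes for p in probs):
        raise ValueError("all configurations must score the same classes")
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def predict(logits_per_config):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_predictions(logits_per_config)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Made-up scores for three configurations over three classes.
    example = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.8, 1.7, 0.2]]
    print(fuse_predictions(example), predict(example))
```

Averaging probabilities rather than raw scores keeps each configuration's contribution on the same scale; a weighted average or a vote would be equally consistent with the abstract's wording.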

Discovering Predictive Relational Object Symbols With Symbolic Attentive Layers

Learning Multi-Object Symbols for Manipulation with Attentive Deep Effect Predictors

Symbolic Manipulation Planning with Discovered Object and Relational Predicates

Identification of Unmodeled Objects from Symbolic Descriptions

Language-guided Adaptive Perception with Hierarchical Symbolic Representations for Mobile Manipulators

Vision-Action Semantic Associative Learning Based on Spiking Neural Networks for Cognitive Robot

RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing

Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach

Object-centric proto-symbolic behavioural reasoning from pixels

LARS-VSA: A Vector Symbolic Architecture For Learning with Abstract Rules

Modeling Long-horizon Tasks as Sequential Interaction Landscapes

Latent Space Planning for Multi-Object Manipulation with Environment-Aware Relational Classifiers

Learning Symbolic Task Representation from a Human-Led Demonstration: A Memory to Store, Retrieve, Consolidate, and Forget Experiences

Leveraging Recursive Processing for Neural-Symbolic Affect-Target Associations

On the Transition from Neural Representation to Symbolic Knowledge

Interpretable Latent Spaces for Learning from Demonstration

Latent Space Planning for Multiobject Manipulation With Environment-Aware Relational Classifiers

Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing

Discovering Robotic Interaction Modes with Discrete Representation Learning

Learning Structural Representations via Dynamic Object Landmarks Discovery for Sketch Recognition and Retrieval

Multi-Agent Dynamic Relational Reasoning for Social Robot Navigation