Abstract: From an early age, humans learn to develop an intuition for the physical nature of the objects around them by using exploratory behaviors. Such exploration provides observations of how objects feel, sound, look, and move as a result of actions applied to them. Previous work in robotics has shown that robots can also use such behaviors (e.g., lifting, pressing, shaking) to infer object properties that camera input alone cannot detect. Such learned representations are specific to each individual robot and cannot currently be transferred directly to another robot with different sensors and actions. Moreover, sensor failure can cause a robot to lose a specific sensory modality, which may prevent it from using perceptual models that require that modality as input. To address these limitations, we propose a framework for knowledge transfer across behaviors and sensory modalities such that: (1) knowledge can be transferred from one or more robots to another, and (2) knowledge can be transferred from one or more sensory modalities to another. We propose two different models for transfer based on variational auto-encoders and encoder-decoder networks. The main hypothesis behind our approach is that if two or more robots share multi-sensory observations of a common set of objects, then those observations can be used to establish mappings between multiple feature spaces, each corresponding to a combination of an exploratory behavior and a sensory modality. We evaluate our approach on a category recognition task using a dataset in which a robot performed 9 behaviors, coupled with 4 sensory modalities, multiple times on 100 objects. The results indicate that sensorimotor knowledge about objects can be transferred both across behaviors and across sensory modalities, such that a new robot (or the same robot, but with a different set of sensors) can bootstrap its category recognition models without having to exhaustively explore the full set of objects.

Multimodal integration learning of robot behavior using deep neural networks

Seamless Integration and Coordination of Cognitive Skills in Humanoid Robots: A Deep Learning Approach

Multimodal representation models for prediction and control from partial information

Multimodel Sensor Fusion for Learning Rich Models for Interacting Soft Robots

Training an Interactive Humanoid Robot Using Multimodal Deep Reinforcement Learning

Deep Learning-based Multimodal Control Interface for Human-Robot Collaboration

Multisensory Learning Framework for Robot Drumming

Learning Deep Features for Robotic Inference from Physical Interactions

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

Enhancing Human–Robot Collaboration through a Multi-Module Interaction Framework with Sensor Fusion: Object Recognition, Verbal Communication, User of Interest Detection, Gesture and Gaze Recognition

Multimodal Sensors and ML‐Based Data Fusion for Advanced Robots

Learning Multimodal Confidence for Intention Recognition in Human-Robot Interaction

Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning

Brain-inspired Multimodal Learning Based on Neural Networks

A Multimodal Information Fusion Model for Robot Action Recognition with Time Series

Multimodal audio-visual robot fusing 3D CNN and CRNN for player behavior recognition and prediction in basketball matches

Bioinspired multisensory neural network with crossmodal integration and recognition

Decision Making of Mobile Robot based on Multimodal Fusion

DeepInteraction++: Multi-Modality Interaction for Autonomous Driving

Learning Sensor Models for Autonomous Information Fusion on a Humanoid Robot

A Framework for Sensorimotor Cross-Perception and Cross-Behavior Knowledge Transfer for Object Categorization