Imitation Learning Inputting Image Feature to Each Layer of Neural Network

Koki Yamane, Sho Sakaino, Toshiaki Tsuji

2024-01-19

Abstract: Imitation learning enables robots to learn and replicate human behavior from training data. Recent advances in machine learning enable end-to-end learning approaches that directly process high-dimensional observation data, such as images. However, these approaches face a critical challenge when processing data from multiple modalities, inadvertently ignoring data with a lower correlation to the desired output, especially when using short sampling periods. This paper presents a useful method to address this challenge, which amplifies the influence of data with a relatively low correlation to the output by inputting the data into each neural network layer. The proposed approach effectively incorporates diverse data sources into the learning process. Through experiments using a simple pick-and-place operation with raw images and joint information as input, significant improvements in success rates are demonstrated even when dealing with data from short sampling periods.

Robotics, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper addresses a failure mode of multimodal imitation learning: inputs that are only weakly correlated with the output tend to be ignored. When several modalities are fed to a model at once, the model relies on the inputs most strongly correlated with the output and may neglect the rest. The effect is most pronounced when the sampling period is short. For example, suppose a robot's next joint angle is predicted from the current joint angle and an image. With a very short sampling period, the current and next joint angles are nearly identical, so simply outputting the current joint angle already yields a small prediction error, and the other inputs (such as the image) end up with very little influence on the output. To address this, the authors propose feeding the weakly correlated data into every layer of the neural network, which increases its influence on the output. In experiments on a simple pick-and-place task with raw images and joint information as input, the method significantly improves the success rate, even with data from short sampling periods.
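The idea of inputting the image feature to each layer can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes a plain feedforward network in NumPy, a precomputed image feature vector, and hypothetical names and sizes (`init_params`, `forward`, 7 joints, a 16-dimensional feature). It only shows the wiring, in which the image feature is concatenated to the input of every layer rather than the first layer alone.

```python
import numpy as np

def init_params(rng, joint_dim, feat_dim, hidden_dim, n_hidden, out_dim):
    """Random weights for an MLP in which every layer also receives the image feature."""
    params = []
    in_dim = joint_dim
    for _ in range(n_hidden):
        # Input to each hidden layer: [previous activation, image feature].
        W = rng.normal(0.0, 0.1, size=(in_dim + feat_dim, hidden_dim))
        params.append((W, np.zeros(hidden_dim)))
        in_dim = hidden_dim
    # The output layer sees the image feature as well.
    W_out = rng.normal(0.0, 0.1, size=(in_dim + feat_dim, out_dim))
    params.append((W_out, np.zeros(out_dim)))
    return params

def forward(params, joints, image_feat):
    """Predict the next joint angles, re-injecting image_feat at every layer."""
    h = joints
    for W, b in params[:-1]:
        h = np.tanh(np.concatenate([h, image_feat], axis=-1) @ W + b)
    W_out, b_out = params[-1]
    return np.concatenate([h, image_feat], axis=-1) @ W_out + b_out

# Usage with illustrative (assumed) sizes: batch of 4, 7 joints, 16-dim image feature.
rng = np.random.default_rng(0)
params = init_params(rng, joint_dim=7, feat_dim=16, hidden_dim=32, n_hidden=3, out_dim=7)
joints = np.zeros((4, 7))
image_feat = rng.normal(size=(4, 16))
next_joints = forward(params, joints, image_feat)
print(next_joints.shape)
```

Because the image feature enters every layer directly, its path to the output does not depend on the first layer alone, which is the intuition behind giving weakly correlated inputs more influence.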

Multimodal integration learning of robot behavior using deep neural networks

Learning One-Shot Imitation From Humans Without Humans

One-Shot Visual Imitation Learning via Meta-Learning

Cortex Inspired In-Place Learning Networks For General Invariance And Multi-Tasks

Rethinking Deep Learning: Non-backpropagation and Non-optimization Machine Learning Approach Using Hebbian Neural Networks

Deep Internal Learning: Deep learning from a single input

Rethinking Deep Learning: Propagating Information in Neural Networks without Backpropagation and Statistical Optimization

Resolving Copycat Problems in Visual Imitation Learning Via Residual Action Prediction

One-Shot Domain-Adaptive Imitation Learning via Progressive Learning

A Multilayer In-Place Learning Network For Development Of General Invariances

Learning Deep Features for Robotic Inference from Physical Interactions.

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Transformers for One-Shot Visual Imitation

Transformer-based deep imitation learning for dual-arm robot manipulation

Seamless Integration and Coordination of Cognitive Skills in Humanoid Robots: A Deep Learning Approach

Multilayer In-Place Learning Networks: Multitask Invariance And Adaptive Lateral Connections

An Algorithmic Perspective on Imitation Learning

Imitation Learning for Object Manipulation Based on Position/Force Information Using Bilateral Control