Multisensory Learning Framework for Robot Drumming

A. Barsky, C. Zito, H. Mori, T. Ogata, J. L. Wyatt
DOI: https://doi.org/10.48550/arXiv.1907.09775
2019-07-23
Abstract: The hype about sensorimotor learning is currently reaching fever pitch, thanks to the latest advancements in deep learning. In this paper, we present an open-source framework for collecting large-scale, time-synchronised synthetic data from highly disparate sensory modalities, such as audio, video, and proprioception, for learning robot manipulation tasks. We demonstrate the learning of non-linear sensorimotor mappings for a humanoid drumming robot that generates novel motion sequences from desired audio data using cross-modal correspondences. We evaluate our system through the quality of its cross-modal retrieval for generating suitable motion sequences that match desired unseen audio or video sequences.
Robotics, Computer Vision and Pattern Recognition, Sound