Real-Time Multi-modal Human-Robot Collaboration Using Gestures and Speech

Haodong Chen, Zhaozheng Yin, Ming C Leu
DOI: https://doi.org/10.1115/1.4054297
2022-04-08
Journal of Manufacturing Science and Engineering
Abstract: As artificial intelligence and industrial automation advance, human-robot collaboration (HRC) with advanced interaction capabilities has become an increasingly significant area of research. In this paper, we design and develop a real-time, multi-modal HRC system using speech and gestures. A set of sixteen dynamic gestures is designed for communication from a human to an industrial robot. A data set of dynamic gestures is designed and constructed, and it will be shared with the community. A convolutional neural network (CNN) is developed to recognize the dynamic gestures in real time using the Motion History Image (MHI) and deep learning methods. An improved open-source speech recognizer is used for real-time speech recognition of the human worker. An integration strategy is proposed to integrate the gesture and speech recognition results, and a software interface is designed for system visualization. A multi-threading architecture is constructed to run multiple tasks simultaneously, including gesture and speech data collection and recognition, data integration, robot control, and software interface operation. The various methods and algorithms are integrated to develop the HRC system, with a platform constructed to demonstrate the system performance. The experimental results validate the feasibility and effectiveness of the proposed algorithms and the HRC system.
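The abstract names the Motion History Image (MHI) as the input representation for gesture recognition but does not spell out how it is computed. As a rough illustration only, the sketch below follows the commonly used MHI update rule: a pixel that moved in the latest frame is set to a maximum value, and every other pixel decays by one per frame. The function names, the `tau` and `threshold` parameters, and the use of plain nested lists for grayscale frames are assumptions made for this example and are not taken from the paper.

```python
def update_mhi(mhi, prev_frame, frame, tau=10, threshold=30):
    """Return the MHI after observing `frame` (hypothetical helper).

    A pixel whose intensity changed by at least `threshold` since
    `prev_frame` is treated as moving and set to `tau`; every other
    pixel decays by 1, floored at 0.
    """
    new_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        new_row = []
        for h, p, c in zip(mhi_row, prev_row, cur_row):
            if abs(c - p) >= threshold:
                new_row.append(tau)            # fresh motion: brightest value
            else:
                new_row.append(max(0, h - 1))  # older motion fades out
        new_mhi.append(new_row)
    return new_mhi


def motion_history_image(frames, tau=10, threshold=30):
    """Fold a sequence of grayscale frames into a single MHI."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, threshold)
    return mhi


# Toy 1x3 "video": motion appears at pixel 0 first, then at pixel 1.
frames = [
    [[0, 0, 0]],
    [[255, 0, 0]],
    [[255, 255, 0]],
]
mhi = motion_history_image(frames)
```

In the resulting image, more recent motion is brighter than older motion, so a single 2D image encodes the temporal order of a dynamic gesture and can be fed to an ordinary image classifier such as a CNN.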