Cross-View Human Intention Recognition for Human-Robot Collaboration
Shouxiang Ni, Lindong Zhao, Ang Li, Dan Wu, Liang Zhou
DOI: https://doi.org/10.1109/mwc.018.2200514
IF: 12.777
2023-07-15
IEEE Wireless Communications
Abstract: Benefiting from the promise of sixth-generation (6G) wireless networks, multimodal machine learning that exploits the complementarity among video, audio, and haptic signals has become a key enabler for human intention recognition, which is critical to realizing effective human-robot collaboration in Industry 4.0 scenarios. However, because multimodal human intention recognition is limited by expensive equipment and a demanding environment, it is hard to strike an efficient trade-off between inference accuracy and system overhead. Naturally, how to derive more intention semantics from readily available videos emerges as a fundamental issue for human intention recognition. In this article, we address this issue with cross-view human intention recognition and demonstrate the effectiveness of our method with well-designed evaluation metrics. Specifically, we first compensate for the scarcity of intention semantics in the body view by adding a face view. Second, we deploy a cross-view generative model to capture the intention semantics induced by the mutual generation of the two views. Finally, in human-robot collaboration experiments, our method comes closer to human performance in terms of response time and inference accuracy.
computer science, information systems; telecommunications; engineering, electrical & electronic; hardware & architecture