Multi-View Active Sensing for Human-Robot Interaction via Hierarchically Connected Tree

Yuanjiong Ying, Xian Huang, Wei Dong
2024-03-19
Abstract: Comprehensive perception of human beings is the prerequisite to ensure the safety of human-robot interaction. Currently, the prevailing visual sensing approach typically involves a single static camera, resulting in a restricted and occluded field of view. In our work, we develop an active vision system using multiple cameras to dynamically capture multi-source RGB-D data. An integrated human sensing strategy based on a hierarchically connected tree structure is proposed to fuse localized visual information. Constituting the tree model are the nodes representing keypoints and the edges representing keyparts, which are consistently interconnected to preserve the structural constraints during multi-source fusion. Utilizing RGB-D data and HRNet, the 3D positions of keypoints are analytically estimated, and their presence is inferred through a sliding window of confidence scores. Subsequently, the point clouds of reliable keyparts are extracted by drawing occlusion-resistant masks, enabling fine registration between the data clouds and a cylindrical model following the hierarchical order. Experimental results demonstrate that our method enhances keypart recognition recall from 69.20% to 90.10%, compared to employing a single static camera. Furthermore, in overcoming challenges related to localized and occluded perception, the robotic arm's obstacle avoidance capabilities are effectively improved.
Robotics
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to achieve comprehensive perception of human obstacles through multi-view active perception technology in human-robot interaction (HRI) so as to improve the safety and task-execution ability of robots. Specifically, the existing visual sensing methods usually rely on a single static camera, resulting in a limited field of view and being easily occluded. This not only affects the recognition and avoidance of human obstacles but also limits the ability of the robot arm to work efficiently in a dynamic environment.

To overcome these challenges, the paper proposes a multi-view active perception system based on a hierarchical connection tree structure. This system uses multiple rotatable cameras to dynamically capture multi-source RGB-D data and constructs a human body model by fusing local visual information. This method can effectively improve the recall rate of key part recognition, from 69.20% to 90.10%, and performs well in dealing with local and occluded perception problems, thereby significantly improving the obstacle-avoidance ability of the robot arm.

### Main Contributions

1. **Multi-camera Active Vision System**: This system can capture RGB-D data from multiple important areas, expanding the perception range.
2. **Hierarchical Connection Tree Structure**: Used to integrate visual information from multi-view dynamic sources and maintain structural constraints.
3. **Information Extraction Method for Anti-occlusion and Local Field of View**: It can effectively handle occlusion and local field of view problems in HRI scenarios.

### Method Overview

1. **Multi-view Active Vision Mechanism**: By adding rotational degrees of freedom to the cameras, dynamic capture of multiple key areas is achieved.
2. **State Estimation of Key Points and Key Parts**:
   - Use HRNet to infer 2D key point positions and their confidence levels from color images.
   - Use depth images to lift 2D key point positions to 3D space.
   - Determine the existence state of key points through the sliding window method.
   - Fuse the key point position information of multiple cameras to obtain more accurate 3D positions.
3. **Key Part Point Cloud Extraction**:
   - Generate occlusion-resistant masks based on the positions of key points.
   - Apply masks to extract point clouds of key parts from depth images.
   - Use the ICP algorithm to register the point cloud with the cylindrical model and update the state of the key part.
4. **Hierarchical Connection Tree Model**:
   - Construct a hierarchical connection tree model to maintain anatomical constraints.
   - Maintain the connectivity of the tree model through supplementary nodes.
   - Perform state estimation in hierarchical order to ensure the connection relationship between each key part and its parent part.

### Experimental Verification

The paper designs three typical production scenarios to verify the effectiveness and universality of the method:

1. **Simple Assembly Task**: A human operator performs a simple assembly task within the working area of the robot arm.
2. **Complex Assembly Task**: Involves more dynamic and complex interactions.
3. **Obstacle-avoidance Task**: Evaluate the obstacle-avoidance ability of the robot arm in different scenarios.

The experimental results show that the proposed multi-view active perception system performs excellently in improving the key part recognition rate and the robot's obstacle-avoidance ability, significantly enhancing the safety and efficiency of human-robot interaction.
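
### Illustrative Sketch of Key Point State Estimation

The key point steps in the method overview (lifting a 2D detection to 3D with depth, deciding presence from a sliding window of confidence scores, and fusing observations from several cameras) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the pinhole back-projection, the mean-over-window presence rule, the confidence-weighted average, and every name and default value (`lift_to_3d`, `PresenceFilter`, `fuse_positions`, `window=5`, `threshold=0.5`) are assumptions introduced here. The summary does not state the exact formulas the paper uses.

```python
from collections import deque


def lift_to_3d(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres into the camera frame.

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy); lens distortion is ignored.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


class PresenceFilter:
    """Sliding window over the most recent confidence scores of one key point.

    Assumed rule: the key point counts as present when the mean score
    over the window reaches the threshold.
    """

    def __init__(self, window=5, threshold=0.5):
        self.scores = deque(maxlen=window)  # old scores drop out automatically
        self.threshold = threshold

    def update(self, confidence):
        self.scores.append(confidence)
        return self.is_present()

    def is_present(self):
        if not self.scores:
            return False
        return sum(self.scores) / len(self.scores) >= self.threshold


def fuse_positions(observations):
    """Confidence-weighted mean of (position, confidence) pairs.

    Positions are assumed to be already expressed in one common frame.
    Returns None when no camera contributes a positive confidence.
    """
    total = sum(conf for _, conf in observations)
    if total <= 0:
        return None
    return tuple(
        sum(pos[i] * conf for pos, conf in observations) / total
        for i in range(3)
    )


if __name__ == "__main__":
    # Hypothetical intrinsics and detections, for illustration only.
    point = lift_to_3d(620, 240, 2.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
    presence = PresenceFilter(window=3)
    for score in (0.9, 0.8, 0.1):
        present = presence.update(score)
    fused = fuse_positions([(point, 0.75), ((1.2, 0.0, 2.0), 0.25)])
    print(point, present, fused)
```

A fixed-length `deque` keeps the window update constant-time, and averaging over the window means a single low-confidence frame (for example, a brief occlusion) does not immediately flip the key point to absent.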