Abstract: Trustworthiness is a crucial concept in the context of human-robot interaction. Cooperative robots must be transparent regarding their decision-making process, especially when operating in a human-oriented environment. This paper presents a comprehensive end-to-end framework aimed at fostering trustworthy bidirectional human-robot interaction in collaborative environments for the social navigation of mobile robots. In this framework, the robot communicates verbally while the human guides with gestures. Our method enables a mobile robot to predict the trajectory of people and adjust its route in a socially-aware manner. In case of conflict between human and robot decisions, detected through visual examination, the route is dynamically modified based on human preference while verbal communication is maintained. We present our pipeline, framework design, and preliminary experiments that form the foundation of our proposition.

What problem does this paper attempt to address?

The main problem this paper addresses is **achieving trustworthy two-way human-robot interaction in collaborative environments, specifically for the social navigation of mobile robots**. The paper focuses on the following aspects:

1. **Enhancing trust**: In human-robot interaction (HRI), especially when robots operate in human environments, transparency and trustworthiness are crucial. Collaborative robots need to be transparent about their decision-making processes so that humans can trust them.
2. **Two-way interaction in social navigation**: Many existing studies overlook two-way interaction and communication. This paper proposes a framework in which the robot explains its actions by voice and adjusts its path according to human gesture feedback, with the goal of improving long-term cooperation and trust.
3. **Predicting and adjusting trajectories**: The robot predicts the trajectories of surrounding people and dynamically adjusts its own path according to those predictions, to avoid conflicts and respect social norms. When human and robot decisions conflict, the robot detects the conflict through visual examination and modifies its route according to the human's preference.
4. **Explaining the decision-making process**: To establish long-term trust, the robot must not only perform tasks but also explain its decisions to humans. This includes explaining why a specific path was chosen and how it was adjusted in response to human feedback.

### Core contributions of the paper

- **Social navigation architecture based on a Graph Attention Network (GAT)**: This architecture predicts trajectories based on the relationships between individuals in the environment.
- **Trustworthy artificial intelligence module**: This module explains the decisions made by the robot based on visual feedback and predicted trajectories.
- **Two-way human-robot interaction**: By recognizing hand gestures and responding by voice, the robot can explain its decision-making process, which supports long-term cooperation and trust with humans.

### Method overview

1. **Human detection and localization**: RGB images are fused with point clouds generated by a 2D LiDAR, and weak perspective projection together with an instance segmentation algorithm is used to detect and locate humans.
2. **Trajectory prediction**: A pre-trained LSTM encoder encodes trajectories into dense graphs, and a GAT then predicts future positions.
3. **Path planning and adjustment**: The robot path is adjusted dynamically according to the predicted trajectories and visual feedback, and the robot explains its behavior by voice.
4. **Gesture recognition**: A MediaPipe model recognizes five gestures (wait, turn left, turn right, continue, unknown), and the path is adjusted according to the recognized gesture.

### Preliminary experiments

So far, the researchers have completed preliminary experiments on some components, including human localization, trajectory encoding, and gesture classification. Future plans include collecting data in a smart factory environment, verifying the performance of the GAT architecture, and evaluating the trustworthiness and ease of use of the system through user surveys.

In conclusion, this paper aims to enhance the trustworthiness and interaction ability of mobile robots in social navigation by introducing a two-way audio-visual interaction framework, thereby promoting trust and comfort in human-robot collaborative environments.
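
The gesture-driven path adjustment described in the method overview can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Gesture`, `Decision`, and `resolve`, the action strings, and the spoken phrases are all hypothetical. The handling of "continue" (keep the planned action) and "unknown" (keep the plan but say the gesture was not understood) is an assumption, since the summary does not specify it. The only behavior taken from the source is that the human preference wins on conflict and that the robot explains its choice verbally.

```python
from dataclasses import dataclass
from enum import Enum


class Gesture(Enum):
    """The five gesture classes named in the summary."""
    WAIT = "wait"
    TURN_LEFT = "turn left"
    TURN_RIGHT = "turn right"
    CONTINUE = "continue"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Decision:
    action: str       # what the robot does next
    explanation: str  # what the robot says aloud (hypothetical wording)


def resolve(planned_action: str, gesture: Gesture) -> Decision:
    """Choose the next action, giving the human's gesture priority on conflict."""
    if gesture is Gesture.UNKNOWN:
        # Assumption: an unrecognized gesture does not override the plan.
        return Decision(
            planned_action,
            f"I did not understand the gesture, so I will {planned_action}.",
        )
    if gesture is Gesture.CONTINUE:
        # Assumption: "continue" confirms the robot's own plan.
        return Decision(planned_action, f"Understood, I will {planned_action} as planned.")
    requested = gesture.value
    if requested == planned_action:
        # No conflict: human and robot want the same thing.
        return Decision(planned_action, f"We agree: I will {planned_action}.")
    # Conflict: follow the human preference and explain the change.
    return Decision(
        requested,
        f"I planned to {planned_action}, but I will {requested} as you asked.",
    )


if __name__ == "__main__":
    decision = resolve("turn left", Gesture.TURN_RIGHT)
    print(decision.action)
    print(decision.explanation)
```

In a full system, `planned_action` would come from the planner that consumes the predicted trajectories, and the explanation string would be passed to a text-to-speech component; both are outside the scope of this sketch.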

Bidirectional Human Interactive AI Framework for Social Robot Navigation

Social navigation framework for assistive robots in human inhabited unknown environments

Efficient and Trustworthy Social Navigation Via Explicit and Implicit Robot-Human Communication

A framework for trust-related knowledge transfer in human–robot interaction

Learning Early Social Maneuvers for Enhanced Social Navigation

Enabling Socially Competent navigation through incorporating HRI

Enhancing Socially-Aware Robot Navigation through Bidirectional Natural Language Conversation

EmoiPlanner: Human emotion and intention aware socially acceptable robot navigation in human‐centric environments

Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework

Safe Human-Robot Collaborative Transportation via Trust-Driven Role Adaptation

A Complementary Framework for Human-Robot Collaboration with a Mixed AR-Haptic Interface

Autonomous Navigation For Mobile Robots With Human-Robot Interaction

Toward Mutual Trust Modeling in Human-Robot Collaboration

A Survey on Human-aware Robot Navigation

PANav: Toward Privacy-Aware Robot Navigation via Vision-Language Models

Socially Integrated Navigation: A Social Acting Robot with Deep Reinforcement Learning

Social Navigation with Human Empowerment driven Deep Reinforcement Learning

Learning Social Navigation from Demonstrations with Conditional Neural Processes

Language-guided Robust Navigation for Mobile Robots in Dynamically-changing Environments

Humanising robot-assisted navigation

Human-robot interaction through adjustable social autonomy