Abstract: This research uses the NAO robot platform to develop three single-scene modes: visual line-following navigation, object localization and grasping, and moving-object tracking with obstacle avoidance. A frame-based expert system then stores the knowledge of each mode in a knowledge base, and an inference engine combines that knowledge to achieve multi-mode fusion. For visual line-following navigation, the proposed fast path-extraction method extracts path information accurately under noisy interference and also increases processing speed, which preserves the robot's real-time performance. A slope-compensation PID controller is proposed to regulate the robot's walking parameters, ensuring low error and high stability while the robot follows the navigation line. For object localization and grasping, a monocular localization algorithm is proposed that exploits the characteristics of QR codes, and the kinematics of the robot arm are modeled. The robot's end effector is then moved to the position given by the three-dimensional coordinates of the QR code to grasp the target object. For moving-object tracking and obstacle avoidance, a NaoMark marker is used to simplify the features of the moving target, and the robot's walking parameters are controlled according to the position of the marker's center in the image. To keep the robot from colliding with obstacles while it moves, obstacle avoidance combines ultrasonic sensor measurements with the artificial potential field algorithm. Finally, the implementation methods of the three modes are organized into a knowledge base in the form of a frame-based expert system, and the inference engine combines this knowledge to fuse the modes and complete tasks that require more than one of them.
This research presents an original solution, in the form of a model, that enables robots to perform complex service tasks consisting of multiple connected actions in a dynamic environment. The proposed methodology lets the robot operate intelligently in different scenes and respond quickly according to actual requirements. The study enhances the vision functions of the prototype robot and, through multi-mode fusion, adds value to its mobility, which improves the intelligent control capability of service robots in the home environment. In the future, the approach will be extended to hospital medical care, public safety, and other professional applications.
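As an illustration of the obstacle-avoidance idea mentioned in the abstract, the following is a minimal sketch of one step of a generic artificial potential field, not the authors' implementation. The function name `potential_field_step`, the gains `k_att` and `k_rep`, the influence distance `d0`, and the assumption that obstacle positions have already been estimated from ultrasonic range readings are all hypothetical choices made for this example.

```python
import math


def potential_field_step(robot, goal, obstacles, k_att=1.0, k_rep=0.5, d0=0.8):
    """Return a unit heading (dx, dy) from attractive and repulsive forces.

    robot, goal: (x, y) positions in metres (hypothetical planar frame).
    obstacles:   list of (x, y) obstacle positions, assumed here to come
                 from ultrasonic range measurements.
    d0:          influence distance; obstacles farther away are ignored.
    """
    # Attractive force: pulls the robot linearly toward the goal.
    fx = k_att * (goal[0] - robot[0])
    fy = k_att * (goal[1] - robot[1])

    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Repulsive force: grows sharply as the obstacle gets closer
            # and points away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d

    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return (0.0, 0.0)  # forces cancel: a local minimum of the field
    return (fx / norm, fy / norm)
```

The returned heading could be scaled into forward and lateral walking speeds; how the paper maps the field to NAO's walking parameters is not stated in the abstract, so that step is left out here.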
