Abstract: Although many wearable sensors make the acquisition of multi-modality data easier, effective feature extraction and fusion of these data remain challenging for lower limb locomotion mode recognition. In this article, a novel neural network is proposed for accurate prediction of five common lower limb locomotion modes: level walking, ramp ascent, ramp descent, stair ascent, and stair descent. First, an encoder-decoder structure is employed to enrich channel diversity so that useful patterns can be separated from combined patterns. Second, a self-attention based cross-modality interaction module is proposed, which enables bilateral information flow between two encoding paths to fully exploit the interdependencies and to find complementary information between modalities. Third, a multi-modality fusion module is designed in which the complementary features are fused by a channel-wise weighted summation whose coefficients are learned end-to-end. A benchmark dataset containing EMG and IMU signals for the five locomotion modes is collected from 10 healthy subjects. Extensive experiments are conducted on one publicly available dataset, ENABL3S, and on the self-collected dataset. The results show that the proposed method achieves higher classification accuracy than the compared methods: 98.25% on the ENABL3S dataset and 95.51% on the self-collected dataset.

Note to Practitioners: This article addresses a practical challenge that arises when intelligent recognition algorithms are applied in wearable robots: how to fuse multi-modality data effectively and efficiently for better decision-making. First, most existing methods directly concatenate the multi-modality data, which increases the data dimensionality and adds computational burden. Second, existing recognition neural networks continuously compress the feature size, so the discriminative patterns are submerged in noise and become difficult to identify. This research decomposes the mixed input signals along the channel dimension so that the useful patterns can be separated. Moreover, this research employs a self-attention mechanism to capture correlations between the two modalities and uses these correlations as new features for subsequent representation learning, generating new, compact, and complementary features for classification. We demonstrate that the proposed network achieves 98.25% accuracy with a prediction time of 3.5 ms. We anticipate that the proposed network could serve as a general scientific and practical methodology for multi-modality signal fusion and feature learning in intelligent systems.
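The two mechanisms named above, bilateral cross-modality attention and channel-wise weighted summation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the feature shapes (50 time steps, 16 channels), the absence of learned query/key/value projections, the residual connection, and the softmax normalization of the fusion coefficients are all assumptions made for illustration. In a trained network the fusion logits would be learned end-to-end; here they are fixed at zero.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feat, context_feat):
    # Scaled dot-product attention where one modality queries the other.
    # Both inputs have shape (T, C); the output has shape (T, C).
    # Assumption: identity projections instead of learned Q/K/V matrices.
    d = query_feat.shape[-1]
    scores = query_feat @ context_feat.T / np.sqrt(d)  # (T, T)
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    return weights @ context_feat

def channelwise_fusion(feat_a, feat_b, logits):
    # Channel-wise weighted summation of two feature maps.
    # logits has shape (2, C). Assumption: a softmax over the modality
    # axis makes the two coefficients of each channel sum to 1.
    w = softmax(logits, axis=0)
    return w[0] * feat_a + w[1] * feat_b

rng = np.random.default_rng(0)
emg = rng.standard_normal((50, 16))  # hypothetical encoded EMG features
imu = rng.standard_normal((50, 16))  # hypothetical encoded IMU features

# Bilateral information flow: each path attends to the other one.
emg_enriched = emg + cross_attention(emg, imu)
imu_enriched = imu + cross_attention(imu, emg)

# Zero logits give equal weights; training would adjust them per channel.
logits = np.zeros((2, 16))
fused = channelwise_fusion(emg_enriched, imu_enriched, logits)
print(fused.shape)  # (50, 16)
```

Unlike direct concatenation, which would yield 32 channels in this example, the weighted summation keeps the fused feature at 16 channels, which is consistent with the compactness argument made in the note to practitioners.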

Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors

Multimodal representation models for prediction and control from partial information

Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation

Cross-Modality Self-Attention and Fusion-Based Neural Network for Lower Limb Locomotion Mode Recognition

Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations

Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

Multimodal fusion for autonomous navigation via deep reinforcement learning with sparse rewards and hindsight experience replay

Dynamic bottleneck with a predictable prior for image-based deep reinforcement learning

Multimodal integration learning of robot behavior using deep neural networks

Self-supervised Sequential Information Bottleneck for Robust Exploration in Deep Reinforcement Learning

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

Information-Theoretic Odometry Learning

Information-Bottleneck-Based Behavior Representation Learning for Multi-agent Reinforcement learning

On-the-fly Modulation for Balanced Multimodal Learning

Jointly Optimizing Sensing Pipelines for Multimodal Mixed Reality Interaction

Combining Reconstruction and Contrastive Methods for Multimodal Representations in RL

Multimodal Representation Learning for Place Recognition Using Deep Hebbian Predictive Coding

MBC: Multi-Brain Collaborative Control for Quadruped Robots

Dual Models to Facilitate Learning of Policy Network

Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably).

Multi-Modal Fusion in Contact-Rich Precise Tasks via Hierarchical Policy Learning