Convolution‐enhanced vision transformer method for lower limb exoskeleton locomotion mode recognition

Jianbin Zheng, Chaojie Wang, Liping Huang, Yifan Gao, Ruoxi Yan, Chunbo Yang, Yang Gao, Yu Wang
DOI: https://doi.org/10.1111/exsy.13659
IF: 3.3
2024-06-20
Expert Systems
Abstract: Providing the human body with smooth and natural assistance through lower limb exoskeletons is crucial. However, a significant challenge is identifying various locomotion modes so that the exoskeleton can offer seamless support. In this study, we propose a method for locomotion mode recognition named Convolution‐enhanced Vision Transformer (Conv‐ViT). This method combines the strengths of convolution for feature extraction and fusion with the self‐attention mechanism of the Transformer to efficiently capture long‐term dependencies among different positions within the input sequence. By equipping the exoskeleton with inertial measurement units, we collected motion data from 27 healthy subjects and used it as input to train the Conv‐ViT model. To ensure the exoskeleton's stability and safety during transitions between locomotion modes, we examined not only the five typical steady modes (walking on level ground [WL], stair ascent [SA], stair descent [SD], ramp ascent [RA], and ramp descent [RD]) but also eight locomotion transitions (WL‐SA, WL‐SD, WL‐RA, WL‐RD, SA‐WL, SD‐WL, RA‐WL, and RD‐WL). For the five steady modes and the eight transitions, recognition accuracy reached 98.87% and 96.74%, respectively. Compared with three popular algorithms (ViT, convolutional neural networks, and support vector machines), the proposed method achieves the best recognition performance, with highly significant differences in accuracy and F1 score relative to the other methods. Finally, we also demonstrate that Conv‐ViT generalizes well.
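The idea described in the abstract, convolution for local feature extraction followed by self‐attention across positions in the sequence, can be illustrated with a minimal sketch. This is not the authors' Conv‐ViT implementation: the function names, the toy signal, and the two kernels are hypothetical, the attention uses identity query/key/value projections for brevity, and no classification head is included.

```python
import math

def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form) over one sensor channel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def conv_tokens(signal, kernels):
    """Apply each kernel and stack the results so every time step is one token."""
    maps = [conv1d(signal, kernel) for kernel in kernels]
    return [list(step) for step in zip(*maps)]

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a trained model would learn separate projection matrices.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out

# Hypothetical single-channel IMU window (e.g. one gyroscope axis).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
# Hypothetical kernels: a difference filter and a smoothing filter.
kernels = [[1.0, 0.0, -1.0], [0.25, 0.5, 0.25]]

tokens = conv_tokens(signal, kernels)   # local features per time step
attended = self_attention(tokens)       # each step mixes in all other steps
```

Each row of `attended` is a weighted average of all convolutional tokens, which is how the attention stage lets distant time steps in a gait window influence one another.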
computer science, artificial intelligence, theory & methods
What problem does this paper attempt to address?