A Shortcut Enhanced LSTM-GCN Network for Multi-Sensor Based Human Motion Tracking

Xiaoyu Li, Chaoxiang Ye, Binhua Huang, Zhenning Zhou, Yuanzhe Su, Yue Ma, Zhengkun Yi, Xinyu Wu
DOI: https://doi.org/10.1109/tase.2023.3307890
IF: 6.636
2023-01-01
IEEE Transactions on Automation Science and Engineering
Abstract: Multi-sensor based motion tracking is of great interest to the robotics community because it may lessen the need for expensive optical motion capture equipment. However, traditional convolution algorithms have difficulty adapting to the data because the relative positions of the joints change during motion, and the time-series networks often used in the past ignore the spatial characteristics of the sensors. We tackle this challenge by combining long short-term memory (LSTM) with a graph convolution network (GCN), adding prior knowledge of the sensor distribution and integrating it into the motion law through the adjacency matrix. This article proposes a novel shortcut enhanced LSTM-GCN network (SE-LSTM-GCN). It connects LSTM and GCN in sequence to extract temporal and spatial features of the data. In addition, shortcuts are used in the network to enhance the output of two middle layers and to restore the filtered information. Our experimental results on two different motion tracking datasets show that the proposed network learns the mapping relationship with better universality and lower tracking error, without much increase in training time, and performs human motion tracking tasks better.

Note to Practitioners: Accurate, real-time motion tracking suits built from multiple soft sensors are increasingly accepted because of their low cost. However, soft-sensor based motion tracking is not yet comparable to traditional optical equipment in prediction error. To this end, we present a novel network, shortcut enhanced LSTM-GCN (SE-LSTM-GCN), consisting of shortcuts, long short-term memory (LSTM), and a graph convolution network (GCN). The LSTM handles the nonlinearity and hysteresis of soft strain sensors, and the GCN is integrated into the network because knowledge of the sensor locations can be encoded in the adjacency matrix generated by k-nearest neighbors (KNN). Shortcuts are used to enhance the output of the middle layers to form combined features. Experimental results on two public datasets show that the proposed network is superior to competing algorithms in terms of prediction error. The network can be deployed on embedded devices, such as VR gloves, to provide a better gaming experience. The current algorithm is based on the relationship between sensor data and distance. In future research, we will focus on adding other human kinematics laws to the network.
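To make the role of the adjacency matrix concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds a KNN adjacency matrix from sensor positions and applies one graph convolution step with a shortcut. Several things here are assumptions that the abstract does not confirm: the function names, the symmetric normalisation D^{-1/2}(A + I)D^{-1/2} (a common GCN choice), the ReLU activation, and the additive form of the shortcut. The LSTM stage that would precede the GCN is omitted.

```python
import numpy as np


def knn_adjacency(positions, k):
    """Symmetric 0/1 adjacency linking each sensor to its k nearest neighbours."""
    n = positions.shape[0]
    # Pairwise Euclidean distances between sensor positions.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a sensor is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrise the neighbour relation


def normalize_adjacency(adj):
    """Assumed normalisation: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer_with_shortcut(features, a_norm, weight):
    """ReLU(A_norm @ X @ W) + X. The additive shortcut is an assumption;
    it requires a square weight so input and output widths match."""
    out = np.maximum(a_norm @ features @ weight, 0.0)
    return out + features


# Hypothetical layout: four sensors on a line, three features per sensor.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [6.0, 0.0]])
a_norm = normalize_adjacency(knn_adjacency(positions, k=1))
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))  # stand-in for per-sensor LSTM outputs
weight = rng.normal(size=(3, 3))
output = gcn_layer_with_shortcut(features, a_norm, weight)
```

In this sketch the sensor layout enters the model only through `a_norm`, so changing the sensor placement changes which sensors exchange information without changing the layer itself.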