MDST: 2-D Human Pose Estimation for SISO UWB Radar Based on Micro-Doppler Signature via Cascade and Parallel Swin Transformer

Xiaolong Zhou, Tian Jin, Yongpeng Dai, Yongping Song, Kemeng Li, Shaoqiu Song
DOI: https://doi.org/10.1109/jsen.2024.3401861
IF: 4.3
2024-07-02
IEEE Sensors Journal
Abstract: This article introduces HPSUR, a benchmark for human pose estimation based on single-input single-output (SISO) ultra-wideband (UWB) radar, which pairs SISO UWB radar sensors with motion capture (MOCAP) technology. The HPSUR dataset, designed specifically for radar-based human pose estimation, comprises 311,963 data frames collected with cross-calibrated SISO UWB radar sensors in conjunction with the Noitom Perception Neuron 3 (N3). It captures diverse movements from five subjects of varying physical characteristics performing four distinct categories of actions. In addition to establishing this benchmark, the article proposes a framework for 2-D HPSUR that processes micro-Doppler (MD) signatures through a combination of cascade and parallel Swin Transformers. Because MD signatures reflect human kinematics, they form the basis for posture identification. To handle the long-range dependencies that arise from the high sampling rates of radar devices, we introduce the MD Swin Transformer (MDST) network, which uses window-based multihead self-attention (W-MSA) and shifted window-based multihead self-attention (SW-MSA) modules to capture the inner-frame and intraframe aspects of the MD signature. Furthermore, this study integrates an inverted feature pyramid network (IFPN) for efficient multiscale feature representation, enriching the feature pyramid with high-level semantics. Extensive experiments on the HPSUR benchmark show that the proposed MDST network improves the accuracy of human pose estimation. This improvement is observed consistently across six MDST variants under various conditions involving diverse subjects and postures, indicating the robustness and generalizability of the approach.
Engineering, Electrical & Electronic; Instruments & Instrumentation; Physics, Applied