AdaPose: Towards Cross-Site Device-Free Human Pose Estimation with Commodity WiFi

Yunjiao Zhou, Jianfei Yang, He Huang, Lihua Xie
2024-10-14
Abstract: WiFi-based pose estimation is a technology with great potential for the development of smart homes and metaverse avatar generation. However, current WiFi-based pose estimation methods are predominantly evaluated under controlled laboratory conditions with sophisticated vision models to acquire accurately labeled data. Furthermore, WiFi CSI is highly sensitive to environmental variables, and direct application of a pre-trained model to a new environment may yield suboptimal results due to domain shift. In this paper, we propose a domain adaptation algorithm, AdaPose, designed specifically for weakly-supervised WiFi-based pose estimation. The proposed method aims to identify consistent human poses that are highly resistant to environmental dynamics. To achieve this goal, we introduce a Mapping Consistency Loss that aligns the domain discrepancy of source and target domains based on inner consistency between input and output at the mapping level. We conduct extensive experiments on domain adaptation in two different scenes using our self-collected pose estimation dataset containing WiFi CSI frames. The results demonstrate the effectiveness and robustness of AdaPose in eliminating domain shift, thereby facilitating the widespread application of WiFi-based pose estimation in smart cities.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main problem this paper addresses is **domain adaptation for unsupervised or weakly supervised human pose estimation from WiFi channel state information (CSI) across different environments**. Current WiFi-based pose estimation methods are evaluated mainly in controlled laboratory environments and rely on complex vision models to obtain precisely labeled data. However, WiFi CSI is highly sensitive to environmental variables, so directly applying a pre-trained model to a new environment can cause significant performance degradation due to domain shift. The paper therefore proposes AdaPose, a domain adaptation algorithm designed for weakly supervised WiFi-based pose estimation, which aims to identify consistent human poses that are highly resistant to environmental dynamics. By introducing a Mapping Consistency Loss, AdaPose aligns the source and target domains and thereby improves the model's generalization to new environments.

### Main Contributions:

1. **Proposing AdaPose**: Presented as the first solution to the cross-domain WiFi human pose estimation problem, capable of identifying consistent human poses with high robustness to environmental dynamics.
2. **Designing Mapping Consistency Loss**: This loss function aligns the source and target domains based on the intrinsic consistency between input and output, which makes it suitable for regression tasks.
3. **Experimental Validation**: Extensive experiments demonstrate AdaPose's strong domain adaptation capability in both unsupervised and semi-supervised learning.

### Background Challenges:

- **Complexity**: Human pose estimation requires predicting the 2D positions of 17 joints, which requires the model to extract subtle and representative features from limited and coarse WiFi CSI data.
- **Environmental Interference**: WiFi data contains numerous environmental interference factors, which can severely affect the model's generalization ability in new environments.

### Solution:

- **Deep Learning**: Traditional methods cannot effectively model complex human motion or extract useful features, whereas deep learning methods can extract fine-grained features end-to-end and capture human motion information from limited and coarse WiFi CSI data.
- **Domain Adaptation**: Existing WiFi domain adaptation methods mainly align domains at the feature level, which may lead to scale inconsistency. AdaPose instead aligns domains at the mapping level, which ensures scale invariance and narrows the gap between the source and target domains.

### Experimental Setup:

- **System Configuration**: Two TP-Link N750 routers serve as the transmitter and receiver, and a camera collects visual information used to generate labels.
- **Data Collection**: Data is collected in two different scenes, containing 13,728 and 9,504 frames of video and WiFi data, respectively.
- **Evaluation Metrics**: The PCK (Percentage of Correct Keypoints) metric is used to evaluate the model's accuracy.

With these methods, AdaPose addresses the domain adaptation problem for WiFi signals across environments, supporting wider application of WiFi-based human pose estimation in fields such as smart homes.
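The PCK metric named in the experimental setup can be illustrated with a short sketch. PCK counts a predicted keypoint as correct when it lies within a threshold distance of its ground-truth position, where the threshold is a fraction `alpha` of a per-frame reference length. The summary does not state which reference length the paper uses, so the code below assumes the diagonal of the ground-truth keypoint bounding box; the function name `pck` and its parameters are hypothetical and not taken from the paper.

```python
import math


def pck(pred, gt, alpha=0.5):
    """Percentage of Correct Keypoints over a batch of frames.

    pred, gt: lists of frames; each frame is a list of (x, y) keypoints
    in the same joint order.
    alpha: fraction of the reference length used as the distance threshold.
    """
    correct = 0
    total = 0
    for pred_frame, gt_frame in zip(pred, gt):
        xs = [x for x, _ in gt_frame]
        ys = [y for _, y in gt_frame]
        # ASSUMPTION: normalize by the diagonal of the ground-truth
        # bounding box; the paper's normalizer may differ.
        scale = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
        for (px, py), (gx, gy) in zip(pred_frame, gt_frame):
            # A keypoint is correct if its error is within alpha * scale.
            if math.hypot(px - gx, py - gy) <= alpha * scale:
                correct += 1
            total += 1
    return 100.0 * correct / total if total else 0.0


# One frame with two keypoints; the bounding-box diagonal is 5.
gt = [[(0.0, 0.0), (3.0, 4.0)]]
pred = [[(0.0, 1.0), (3.0, 8.0)]]  # errors of 1 and 4
print(pck(pred, gt, alpha=0.5))  # threshold 2.5, so one of two is correct
```

Because the threshold scales with the size of the person in each frame, the score does not depend on how large the subject appears in the image, which is one reason PCK is commonly reported at several `alpha` values.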