WiMix: A Lightweight Multimodal Human Activity Recognition System Based on WiFi and Vision

Jiajing Chen, Kun Yang, Xiaolong Zheng, Shengbo Dong, Liang Liu, Huadong Ma
DOI: https://doi.org/10.1109/mass58611.2023.00057
2023-01-01
Abstract: Human activity recognition is important for a wide range of applications such as surveillance systems and human-computer interaction. Computer-vision-based human activity recognition suffers from performance degradation in many real-world scenarios where illumination is poor. WiFi sensing, a recently proposed alternative that leverages ubiquitous WiFi signals for activity recognition, is not affected by illumination but has low accuracy in dynamic environments. In this paper, we propose WiMix, a lightweight and robust multimodal system that leverages both WiFi and vision for human activity recognition. To deal with complex real-world environments, we design a lightweight mix cross attention module that automatically distributes weights between the WiFi and video modalities. To reduce the system response time while preserving sensing accuracy, we design an end-to-end framework together with an efficient classifier that extracts spatial and temporal features from the two modalities. Extensive experiments in real-world scenarios demonstrate that WiMix achieves 98.5% activity recognition accuracy across three scenarios, outperforming the 89.6% accuracy of the state-of-the-art approach that uses WiFi and video modalities. WiMix also reduces the inference latency from 1268.25 ms to 217.36 ms, significantly improving the response time.
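To put the reported figures in perspective, the short Python sketch below derives the accuracy gain, the relative error reduction, and the latency speedup from the four numbers quoted in the abstract. The function name `compare_reported_figures` and the dictionary keys are hypothetical and chosen only for illustration; the derived quantities are simple arithmetic on the abstract's numbers, not results stated by the authors.

```python
def compare_reported_figures(
    baseline_acc=89.6,    # state-of-the-art accuracy quoted in the abstract (%)
    wimix_acc=98.5,       # WiMix accuracy quoted in the abstract (%)
    baseline_ms=1268.25,  # baseline inference latency quoted in the abstract (ms)
    wimix_ms=217.36,      # WiMix inference latency quoted in the abstract (ms)
):
    """Derive comparison metrics from the abstract's numbers (illustrative only)."""
    baseline_err = 100.0 - baseline_acc
    wimix_err = 100.0 - wimix_acc
    return {
        # Absolute gain in percentage points.
        "accuracy_gain_points": wimix_acc - baseline_acc,
        # Share of the baseline's errors that WiMix removes.
        "relative_error_reduction_pct": 100.0 * (baseline_err - wimix_err) / baseline_err,
        # How many times faster one inference is.
        "speedup": baseline_ms / wimix_ms,
        # Share of the baseline latency that is saved.
        "latency_reduction_pct": 100.0 * (baseline_ms - wimix_ms) / baseline_ms,
    }


if __name__ == "__main__":
    for name, value in compare_reported_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these numbers, the gain is about 8.9 percentage points in accuracy, and inference is roughly 5.8 times faster, which corresponds to a latency reduction of about 83%.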