A Multiscale Cross-Modal Interactive Fusion Network for Human Activity Recognition Using Wearable Sensors and Smartphones

Xin Yang, Zeju Xu, Haodong Liu, Peter B. Shull, Stephen Redmond, Guanzheng Liu, Changhong Wang
DOI: https://doi.org/10.1109/jiot.2024.3400022
IF: 10.6
2024-01-01
IEEE Internet of Things Journal
Abstract: Human activity recognition (HAR) enables real-time monitoring of human movement, posture, and activity level, and can provide valuable information for health management. With the continuous advancement of Internet of Things (IoT) technology, wearable sensors and smartphones equipped with various types of sensors have become widely utilized to collect multimodal data for HAR. However, in multimodal HAR, current fusion methods fall short in capturing inter-modality correlations, hampering the full exploitation of complementary information between modalities and leading to lower recognition accuracy. We thus propose a novel multiscale cross-modal interactive fusion network (MCIFN), which can fully capture correlations between various modalities and obtain an effective fused representation for HAR. Specifically, we employ a multiscale parallel convolution module to extract features from each modality at multiple scales. Then, an interactive fusion strategy based on the cross-modal attention mechanism is introduced to adjust and enhance each modality based on its correlations with other modalities. Additionally, to resolve the information redundancy caused by the interactive fusion strategy, we utilize a hybrid attention module to focus on important information in the fusion representation. Extensive experiments conducted on three publicly available datasets and one private dataset demonstrate that our proposed network outperforms the previous baseline networks for HAR. Additionally, our proposed fusion strategy yielded a notable improvement in accuracy ranging from 1.87% to 9.96% compared to existing strategies. These findings imply that our newly proposed network can realize comprehensive multimodal fusion and effectively enhance HAR accuracy, potentially contributing to advancements in individual health management and personalized healthcare interventions.
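The abstract does not give the exact formulation of the interactive fusion step, so the following is only a minimal sketch of generic scaled dot-product cross-modal attention, not MCIFN's actual implementation. It assumes two modalities with the same feature width, omits learned projection matrices (identity is used instead), and adds the attended context back to the target as a residual. The names `cross_modal_attention`, `accel`, and `gyro` are hypothetical and chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(target, source):
    """Enhance `target` (T_a x d) using features from `source` (T_b x d).

    Queries come from the target modality; keys and values come from the
    source modality. Assumption: no learned projections, for brevity.
    """
    d = len(target[0])
    scale = 1.0 / math.sqrt(d)
    enhanced = []
    for q in target:
        # Similarity of this target time step to every source time step.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in source]
        weights = softmax(scores)
        # Weighted sum of source features (the cross-modal context).
        context = [sum(w * v[j] for w, v in zip(weights, source)) for j in range(d)]
        # Residual connection: adjust the target with the context.
        enhanced.append([qi + ci for qi, ci in zip(q, context)])
    return enhanced


# Toy feature sequences (hypothetical): 3 accelerometer steps, 2 gyroscope steps.
accel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gyro = [[0.5, 0.5], [2.0, 0.0]]

# "Interactive" fusion in this sketch: each modality attends to the other.
accel_enh = cross_modal_attention(accel, gyro)
gyro_enh = cross_modal_attention(gyro, accel)
```

Each enhanced sequence keeps the length of its own modality, so the two outputs could then be concatenated or pooled before a classifier. A real model would typically use learned query, key, and value projections and multiple attention heads.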