A Survey on Multimodal Wearable Sensor-based Human Action Recognition

Jianyuan Ni, Hao Tang, Syed Tousiful Haque, Yan Yan, Anne H.H. Ngu
2024-04-15
Abstract: The combination of increased life expectancy and falling birth rates is resulting in an aging population. Wearable Sensor-based Human Activity Recognition (WSHAR) emerges as a promising assistive technology to support the daily lives of older individuals, unlocking vast potential for human-centric applications. However, recent surveys in WSHAR have been limited, focusing either solely on deep learning approaches or on a single sensor modality. In real life, humans interact with the world in a multi-sensory way, where diverse information sources are intricately processed and interpreted to form a complex and unified sensing system. To give machines similar intelligence, multimodal machine learning, which merges data from various sources, has become a popular research area with recent advancements. In this study, we present a comprehensive survey, from a novel perspective, on how to apply multimodal learning to the WSHAR domain, aimed at both newcomers and researchers. We begin by presenting the recent sensor modalities as well as deep learning approaches in HAR. Subsequently, we explore the techniques used in present multimodal systems for WSHAR. These include inter-multimodal systems, which utilize sensor modalities from both visual and non-visual systems, and intra-multimodal systems, which take modalities only from non-visual systems. After that, we focus on current multimodal learning approaches that have been applied to solve some of the challenges existing in WSHAR. Specifically, we make an extra effort to connect the existing multimodal literature from other domains, such as computer vision and natural language processing, with the current WSHAR area. Finally, we identify the corresponding challenges and potential research directions in the current WSHAR area for further improvement.
Signal Processing, Machine Learning, Multimedia
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper primarily focuses on how to utilize multimodal learning methods to improve wearable sensor-based human activity recognition (WSHAR) systems. Specifically:

1. **Needs of an Aging Society**:
   - With the global population aging, improving the quality of life for the elderly through technological means has become an important research direction. WSHAR, as an assistive technology, has great potential in health monitoring, fall detection, and other areas.
2. **Limitations of Existing WSHAR Systems**:
   - Current WSHAR systems based on a single modality (whether visual or non-visual) have limitations such as privacy issues and a lack of robustness. In particular, video-based methods, although rich in information, are not suitable for application scenarios where privacy must be protected.
3. **Advantages of Multimodal Learning**:
   - Multimodal learning can combine the advantages of different data sources to provide more accurate and robust activity recognition results. The paper emphasizes the importance of multi-sensory perception in daily life and explores how this concept can be applied to the WSHAR field.
4. **Comprehensive Review of Existing Research**:
   - The paper provides a comprehensive review of existing WSHAR systems, including the characteristics of different sensor modalities and the application of deep learning methods. Additionally, the paper discusses the application of multimodal systems in WSHAR: inter-multimodal systems, which combine visual and non-visual sensor modalities, and intra-multimodal systems, which use only non-visual modalities.
5. **Future Research Directions**:
   - Finally, the paper identifies the challenges in the current WSHAR field and proposes possible future research directions, aiming to provide a useful reference for researchers new to the field.

Through the above content, the paper aims to fill existing knowledge gaps and promote the further development of multimodal learning in the WSHAR field.