Contrastive Learning with Auxiliary User Detection for Identifying Activities

Wen Ge, Guanyi Mou, Emmanuel O. Agu, Kyumin Lee
2024-10-21
Abstract: Human Activity Recognition (HAR) is essential in ubiquitous computing, with far-reaching real-world applications. While recent SOTA HAR research has demonstrated impressive performance, some key aspects remain under-explored. Firstly, HAR can be both highly contextualized and personalized. However, prior work has predominantly focused on being Context-Aware (CA) while largely ignoring the necessity of being User-Aware (UA). We argue that addressing the impact of innate user action-performing differences is as crucial as considering external contextual environment settings in HAR tasks. Secondly, being user-aware makes the model acknowledge user discrepancies but does not necessarily guarantee mitigation of these discrepancies, i.e., unified predictions under the same activities. There is a need for a methodology that explicitly enforces closer (different user, same activity) representations. To bridge this gap, we introduce CLAUDIA, a novel framework designed to address these issues. Specifically, we expand the contextual scope of the CA-HAR task by integrating User Identification (UI) within the CA-HAR framework, jointly predicting both CA-HAR and UI in a new task called User and Context-Aware HAR (UCA-HAR). This approach enriches personalized and contextual understanding by jointly learning user-invariant and user-specific patterns. Inspired by SOTA designs in the visual domain, we introduce a supervised contrastive loss objective on instance-instance pairs to enhance model efficacy and improve learned feature quality. Evaluation across three real-world CA-HAR datasets reveals substantial performance enhancements, with average improvements ranging from 5.8% to 14.1% in Matthews Correlation Coefficient and 3.0% to 7.2% in Macro F1 score.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses two under-explored aspects of human activity recognition (HAR):

1. **Fusion of context-awareness and user-awareness**: Although existing HAR research has made significant progress in context-awareness (CA), most works have ignored the importance of user-awareness (UA). The paper points out that, to achieve the best performance, HAR models need to consider not only the influence of the external environment but also the differences among users performing the same activity. The paper therefore proposes a new framework that considers both context and users' personalized characteristics.
2. **Unified activity recognition**: The paper emphasizes that HAR models should reliably recognize the same activity when different users perform it. This means the model needs to learn representations of the same activity that are as close as possible across users. In addition, identifying the user who performs the activity is important in some application scenarios, such as preventing others from performing medically prescribed activities on behalf of the patient.

To solve these problems, the paper introduces a new framework named **Contrastive Learning with Auxiliary User Detection for Identifying Activities (CLAUDIA)**. Specifically, CLAUDIA extends the traditional context-aware HAR task in the following ways:

- **Integrated user identification (UI)**: Add the user identification (UI) task to the context-aware HAR (CA-HAR) framework and jointly predict context-aware HAR and user identification, forming a new task called User and Context-Aware HAR (UCA-HAR). This method enriches the understanding of personalization and context by jointly learning user-invariant and user-specific patterns.
- **Supervised contrastive loss**: Inspired by instance-to-instance relationship modeling in computer vision and natural language processing, the paper introduces a supervised contrastive loss that acts on instance pairs to improve model performance and the quality of the learned features. This loss function captures user-specific and user-invariant information by reducing the distance between the representation vectors of different users performing the same activity.

Through theoretical analysis, empirical studies on real-world datasets, and rigorous experiments, the paper shows the importance of each component in the CLAUDIA framework and its relationship to existing methods. The experimental results show that CLAUDIA improves HAR performance on three real-world datasets, with average improvements ranging from 5.8% to 14.1% in Matthews Correlation Coefficient (MCC) and from 3.0% to 7.2% in Macro F1 score, matching the figures reported in the abstract.
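
The supervised contrastive idea described above, pulling together the representations of different users performing the same activity, can be illustrated with a minimal sketch. It assumes the widely used SupCon formulation (temperature-scaled cosine similarity, with positives defined by a shared label) and a single activity label per sample; the exact objective used in CLAUDIA may differ, and the function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, activity_labels, temperature=0.1):
    """SupCon-style loss over one batch (illustrative, not the paper's code).

    Positives for an anchor are all other samples with the same activity
    label, regardless of which user produced them.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and activity_labels[j] == activity_labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy batch: users A and B each perform "walk" and "sit" (order: A, A, B, B).
labels = ["walk", "sit", "walk", "sit"]
# Embeddings that cluster by activity across users.
grouped_by_activity = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
# Embeddings that cluster by user instead.
grouped_by_user = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

loss_activity = supervised_contrastive_loss(grouped_by_activity, labels)
loss_user = supervised_contrastive_loss(grouped_by_user, labels)
print(loss_activity < loss_user)
```

In this toy batch the loss is lower when embeddings cluster by activity across users than when they cluster by user, which is the behavior the contrastive term is meant to encourage. In a UCA-HAR setup such a term would presumably be combined with the activity and user identification classification losses; how those terms are weighted is not specified here.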