Transportation mode recognition based on low-rate acceleration and location signals with an attention-based multiple-instance learning network

Christos Siargkas, Vasileios Papapanagiotou, Anastasios Delopoulos
DOI: https://doi.org/10.1109/TITS.2024.3387834
2024-04-05
Abstract: Transportation mode recognition (TMR) is a critical component of human activity recognition (HAR) that focuses on understanding and identifying how people move within transportation systems. It is commonly based on leveraging inertial, location, or both types of signals, captured by modern smartphone devices. Each type has benefits (such as increased effectiveness) and drawbacks (such as increased battery consumption) depending on the transportation mode (TM). Combining the two types is challenging as they exhibit significant differences such as very different sampling rates. This paper focuses on the TMR task and proposes an approach for combining the two types of signals in an effective and robust classifier. Our network includes two sub-networks for processing acceleration and location signals separately, using different window sizes for each signal. The two sub-networks are designed to also embed the two types of signals into the same space so that we can then apply an attention-based multiple-instance learning classifier to recognize TM. We use very low sampling rates for both signal types to reduce battery consumption. We evaluate the proposed methodology on a publicly available dataset and compare against other well-known algorithms.
Signal Processing, Machine Learning
What problem does this paper attempt to address?
The paper addresses transportation mode recognition (TMR). Specifically, the authors propose a method that combines low-sampling-rate acceleration and location signals to classify transportation modes efficiently and robustly. TMR systems typically rely on inertial sensors, location sensors, or both, and each signal type has its own trade-offs in battery consumption, signal availability, and sampling rate. The proposed model therefore processes acceleration and location signals in two separate sub-networks, embeds both into the same feature space, and then applies an attention-based multiple-instance learning (MIL) classifier to recognize the transportation mode. The main contributions are:
1. An energy-efficient, lightweight TMR system that combines low-sampling-rate acceleration and location signals collected by smartphone sensors, reducing energy consumption while maintaining high recognition performance.
2. An attention-based MIL framework (Fusion-MIL) that fuses acceleration and location signals and copes with sensor heterogeneity, different sampling rates, and signal unavailability.
3. Additional steps, namely data preprocessing, feature engineering, data augmentation, and pre-training, that further improve the model's effectiveness.
4. An extensive experimental evaluation on public datasets, showing that the method distinguishes eight transportation modes in complex scenarios and outperforms existing methods, including other unimodal and multimodal algorithms.
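The fusion step described above can be illustrated with a minimal sketch of attention-based MIL pooling. This is not the authors' Fusion-MIL implementation: it assumes the commonly used tanh attention form, in which each instance embedding h gets a score w^T tanh(V h) and the scores are normalized with a softmax. The function name `attention_mil_pool`, the parameters `V` and `w` (which would be learned in a real network), and the `available` mask used to skip missing signal windows are all hypothetical.

```python
import math

def attention_mil_pool(instances, V, w, available=None):
    """Attention-weighted average of a bag of instance embeddings.

    instances: list of K embeddings, each a list of D floats
    V: L x D projection matrix (list of rows); w: list of L floats
    available: optional list of K booleans; False entries are ignored
    (assumed here as a simple way to model an unavailable signal window)
    """
    if available is None:
        available = [True] * len(instances)
    if not any(available):
        raise ValueError("bag has no available instances")

    # Unnormalized score per instance: w^T tanh(V h)
    scores = []
    for h, ok in zip(instances, available):
        if not ok:
            scores.append(-math.inf)  # masked instance gets zero weight
            continue
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Numerically stable softmax over the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding: weighted sum of instance embeddings
    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return pooled, weights

# Toy bag: two acceleration-window embeddings and one location-window
# embedding, already mapped into the same 2-D space. The location window
# is marked unavailable, so it receives zero attention weight.
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
pooled, weights = attention_mil_pool(bag, V, w, available=[True, True, False])
```

Because both sub-networks embed into the same space, a single pooling function can treat acceleration and location windows as instances of one bag. The pooled vector would then be passed to a classifier over the transportation modes.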