Synthetic Smartwatch IMU Data Generation from In-the-wild ASL Videos

Panneer Selvam Santhalingam, Parth Pathak, Huzefa Rangwala, Jana Kosecka
DOI: https://doi.org/10.1145/3596261
2023-06-13
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Abstract: The scarcity of training data available for IMUs in wearables poses a serious challenge for IMU-based American Sign Language (ASL) recognition. In this paper, we ask the following question: can we "translate" the large number of publicly available, in-the-wild ASL videos to their corresponding IMU data? We answer this question by presenting a video-to-IMU translation framework (Vi2IMU) that takes user videos as input and estimates the IMU acceleration and gyro data from the perspective of the user's wrist. Vi2IMU consists of two modules: a wrist orientation estimation module that accounts for wrist rotations by carefully incorporating hand joint positions, and an acceleration and gyro prediction module that leverages the orientation for transformation while capturing the contributions of hand movements and shape to produce realistic wrist acceleration and gyro data. We evaluate Vi2IMU by translating publicly available ASL videos to their corresponding wrist IMU data and training a gesture recognition model purely on the translated data. Our results show that the model using translated data performs reasonably well compared to the same model trained using measured IMU data.