Fine-grained Recognition of Manipulation Activities on Objects via Multi-modal Sensing
Xiulong Liu, Bojun Zhang, Lizhang Wang, Sheng Chen, Xin Xie, Xinyu Tong, Tao Gu, Keqiu Li
DOI: https://doi.org/10.1109/tmc.2024.3364522
IF: 6.075
2024-01-01
IEEE Transactions on Mobile Computing
Abstract: Fine-grained recognition of human manipulation activities on objects is crucial in the era of human-computer-object integration. However, there is a lack of solutions for simultaneous recognition of human identity, manipulation activities (including drawing and rotation), and manipulated objects. Therefore, we propose an RF-Camera system that combines RFID and computer vision techniques to address this challenge in multi-person and multi-object scenarios. In RF-Camera, we employ a skeleton-assisted method to extract facial images of target individuals, enabling precise recognition of their identities. To identify manipulation activities, we analyze the 3D hand trajectory and fingertip vector angle, differentiating drawing and rotation manipulation activities. Additionally, we model the target person's hand movements to predict phase data of the target tag, enabling the determination of person-object relationships. Implementing RF-Camera using COTS RFID and Kinect devices involves overcoming challenges such as extracting effective data from noisy streams, predicting virtual phase data considering hand-tag offset, and ensuring high tag reading rates in tag-dense scenarios. We conducted experiments involving six participants performing object manipulation activities, including drawing letters/symbols and rotating movements. Extensive experimental results show that RF-Camera achieves over 90% accuracy in recognizing person identity, manipulation activities, and person-object matching in most conditions.
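The abstract's idea of predicting a tag's phase from tracked hand motion can be illustrated with the standard backscatter phase model, in which phase grows with the round-trip distance between antenna and tag. The sketch below is not the authors' implementation: the function name, the 920.625 MHz carrier, the constant device phase term, and the fixed hand-to-tag offset vector are all assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def predicted_phase(hand_xyz, antenna_xyz, freq_hz=920.625e6,
                    offset_xyz=(0.0, 0.0, 0.0), device_phase=0.0):
    """Predict the RFID phase (radians, in [0, 2*pi)) for a tag held in the hand.

    Hypothetical helper: hand_xyz would come from a depth camera's skeleton
    tracking, and offset_xyz is an assumed fixed hand-to-tag displacement.
    """
    # Estimated tag position = tracked hand position + assumed offset.
    tag_xyz = tuple(h + o for h, o in zip(hand_xyz, offset_xyz))
    distance = math.dist(tag_xyz, antenna_xyz)
    wavelength = SPEED_OF_LIGHT / freq_hz
    # The signal travels antenna -> tag -> antenna, so the path is 2 * distance,
    # giving a phase of 2*pi * (2 * distance) / wavelength plus a device term.
    return (4.0 * math.pi * distance / wavelength + device_phase) % (2.0 * math.pi)
```

Under this model, a phase sequence predicted from one person's hand trajectory could be compared with the phase readings of each tag, and the best-matching pair taken as a candidate person-object relationship. The comparison metric is left open here because the abstract does not specify one.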
computer science, information systems; telecommunications