Understanding the Roles of Video and Sensor Data in the Annotation of Human Activities

Michael Jones, Courtni Byun, Naomi Johnson, Kevin Seppi
DOI: https://doi.org/10.1080/10447318.2022.2101589
2022-08-03
Abstract: Human activities can be recognized in sensor data using supervised machine learning algorithms. In this approach, human annotators must annotate events in the sensor data which are used as input to supervised learning algorithms. Annotating events directly in time series graphs of data streams is difficult. Video is often collected and synchronized to the sensor data to aid human annotators in identifying events in the data. Other work in human activity recognition (HAR) minimizes the cost of annotation by using unsupervised or semi-supervised machine learning algorithms or using algorithms that are more tolerant of human annotation errors. Rather than adjusting algorithms, we focus on the performance of the human annotators themselves. Understanding how human annotators perform annotation may lead to annotation interfaces and data collection schemes that better support annotators. We investigate the accuracy and efficiency of human annotators in the context of four HAR tasks when using video, data, or both to annotate events. After a training period, we found that annotators were more efficient when using data alone on three of four tasks and more accurate when marking event types when using video alone on all four tasks. Annotators were more accurate when marking event boundaries using data alone on two tasks and more accurate using video alone on the other two tasks. Our results suggest that data and video collected for annotation of HAR tasks play different roles in the annotation process and these roles may vary with the HAR task.
computer science, cybernetics, ergonomics