Detecting and Tracking Objects in HRI: YOLO Networks for the Nao "I See You" Function

Jiahuan Zhou, Lihang Feng, Ryad Chellali, Haonan Zhu
DOI: https://doi.org/10.1109/ROMAN.2018.8525582
2018-01-01
Abstract: Object detection and tracking is a basic but key capability in many robotics tasks, including human-robot interaction (HRI). Handing over objects, for instance, relies almost exclusively on such a capability: the robot should be able to detect the object of interest, then localize it and track its movements in order to synchronize the transition phase temporally and spatially. In the past few years, a variety of architectures based on convolutional neural networks, especially the R-CNN and YOLO (You Only Look Once) models, have contributed substantially to improvements in detection accuracy and efficiency. In this paper, we design a visual detection system based on the YOLO architecture for the objects that the Nao robot should detect and track. We adopt the tiny YOLO network as a pretrained model and modify the last fully connected layer for three-class object detection. The model is trained on 4322 training images and achieves a mean average precision (mAP) of 44.3% on our test set. The model is then applied to a well-known function of the Nao vision system, detecting and localizing landmarks, as well as to everyday objects. The results demonstrate that it could help to accurately localize and track objects of interest in HRI.
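As a rough illustration of what modifying the last fully connected layer for three classes involves, the sketch below computes the length of the prediction vector for a YOLOv1-style detection head. The abstract does not state the grid size or the number of boxes per cell, so the values S=7 and B=2 used here are assumptions taken from the original YOLO formulation, and the function name is hypothetical.

```python
def yolo_v1_output_size(grid_size: int, boxes_per_cell: int, num_classes: int) -> int:
    """Length of the flattened prediction vector of a YOLOv1-style head.

    Assumes the original YOLO layout: an S x S grid in which every cell
    predicts B boxes and one shared set of class scores.
    """
    # Each box carries 5 numbers: x, y, w, h and a confidence score.
    values_per_cell = boxes_per_cell * 5 + num_classes
    return grid_size * grid_size * values_per_cell


# Assumed defaults from the original YOLO formulation (not stated in the abstract).
ASSUMED_GRID_SIZE = 7
ASSUMED_BOXES_PER_CELL = 2

# 20-class setting of the original YOLO model on PASCAL VOC.
voc_size = yolo_v1_output_size(ASSUMED_GRID_SIZE, ASSUMED_BOXES_PER_CELL, 20)

# Three-class setting, as in the abstract.
three_class_size = yolo_v1_output_size(ASSUMED_GRID_SIZE, ASSUMED_BOXES_PER_CELL, 3)

print(voc_size, three_class_size)
```

Under these assumptions, only the final layer changes shape when the class count drops from 20 to 3, which is why the earlier pretrained layers can be reused.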