Dilated Light-Head R-CNN using tri-center loss for driving behavior recognition

Mingqi Lu, Yaocong Hu, Xiaobo Lu
DOI: https://doi.org/10.1016/j.imavis.2019.08.004
IF: 3.86
2019-01-01
Image and Vision Computing
Abstract: Unsafe driving behavior causes many traffic accidents, resulting in serious casualties and property losses. What the driver is doing can often be inferred from local cues in the image, such as a hand holding a cigarette. However, existing behavior recognition approaches struggle to distinguish driving behaviors that differ only in local details. In this paper, we recognize driving behavior by detecting action-specific parts and propose the Dilated Light-Head R-CNN (DL-RCNN) approach, which uses dilated convolution to preserve image resolution for critical details. The technical novelties include position-sensitive RoI alignment, which improves the perception of small objects, and a tri-center loss, which enforces similarity among intra-class features and separation between features of distinct classes. We also adopt two strategies: online hard example mining and calibration of key parameters. Evaluation results on the public Kaggle-driving data set and a self-built data set show that DL-RCNN achieves state-of-the-art performance in recognizing driving behavior.
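The abstract does not give the formula for the tri-center loss, so the following is only a minimal sketch of one loss in the same family: a triplet-center style hinge that pulls each feature toward its own class center and pushes it at least a margin away from the nearest other center. The function name `tri_center_style_loss`, the `margin` parameter, the Euclidean distance, and the fixed (non-learned) centers are all assumptions made for illustration and may differ from the paper's definition.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tri_center_style_loss(features, labels, centers, margin=1.0):
    """Hypothetical triplet-center style loss (not the paper's exact formula).

    features: list of feature vectors (lists of floats)
    labels:   list of class ids, one per feature
    centers:  dict mapping class id -> center vector (at least two classes)
    margin:   assumed hinge margin between own-center and other-center distance
    """
    total = 0.0
    for feat, label in zip(features, labels):
        # Distance to the center of the feature's own class (should be small).
        d_own = euclidean(feat, centers[label])
        # Distance to the closest center of any other class (should be large).
        d_other = min(
            euclidean(feat, c) for k, c in centers.items() if k != label
        )
        # Penalize only when the own center is not closer by at least `margin`.
        total += max(0.0, d_own + margin - d_other)
    return total / len(features)


# Two toy classes with centers 4 units apart.
centers = {0: [0.0, 0.0], 1: [4.0, 0.0]}
well_separated = tri_center_style_loss([[0.0, 0.0], [4.0, 0.0]], [0, 1], centers)
ambiguous = tri_center_style_loss([[2.0, 0.0]], [0], centers)
```

In this toy setup, features that sit on their own class center incur no penalty, while a feature equidistant from both centers is penalized by the full margin. In a real detector the centers would typically be learned jointly with the network rather than fixed as they are here.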