Real-time Clothing Detection Networks for Surveillance Videos

Zhiying Li, Hao Chen, Weibin Chen, Shuyuan Lin, Rong Zhao, Yufeng Qian, Zefan Wang, Yuer Yang, Feiran Huang
DOI: https://doi.org/10.1117/12.2680485
2022-01-01
Abstract: Real-time clothing recognition is a useful technique for describing individuals in video streams captured by surveillance cameras. In most cases, surveillance cameras shoot from a high angle and a long distance, so the clothing objects in the captured images are small and blurred. Traditional object detection models are designed to locate prominent, clear objects and then classify them into categories. They are not sensitive enough to small and blurry clothes in surveillance camera images and may miss them entirely. To this end, we propose an effective real-time clothing detection model, Multi-Scale Tiny attention-based networks, called MuST. We also collect 5000 surveillance camera images and manually annotate the clothes into 112 categories to build an accurate benchmark dataset for clothing recognition. In particular, we develop a special tiny decoupled prediction head that helps to detect small and fuzzy clothes more accurately. Moreover, clothes are often occluded or affected by distracting environments, which may reduce classification accuracy. Therefore, we introduce a novel multi-scale concatenation module that integrates and contrasts the information of the clothing objects with that of their local environment. As a result, MuST can better localize and classify small and fuzzy clothing objects. Experimental results on real-world clothing recognition datasets show that MuST achieves the best recognition accuracy among all real-time object recognition models. Moreover, MuST achieves the fastest inference speed among all real-time and non-real-time object recognition models.
What problem does this paper attempt to address?