An Efficient Multi-Label Classification-Based Municipal Waste Image Identification

Rongxing Wu, Xingmin Liu, Tiantian Zhang, Jiawei Xia, Jiaqi Li, Mingan Zhu, Gaoquan Gu
DOI: https://doi.org/10.3390/pr12061075
IF: 3.5
2024-05-25
Processes
Abstract: Sustainable and green waste management has become increasingly crucial due to the rising volume of waste driven by urbanization and population growth. Deep learning models based on image recognition offer potential for advanced waste classification and recycling methods. However, traditional image recognition approaches usually rely on single-label images, neglecting the complexity of real-world waste occurrences. Moreover, there is a scarcity of recognition efforts directed at actual municipal waste data, with most studies confined to laboratory settings. Therefore, we introduce an efficient Query2Label (Q2L) framework, powered by the Vision Transformer (ViT-B/16) as its backbone and complemented by an innovative asymmetric loss function, designed to effectively handle the complexity of multi-label waste image classification. Our experiments on the newly developed municipal waste dataset "Garbage In, Garbage Out", which includes 25,000 street-level images, each potentially containing up to four types of waste, showcase the Q2L framework's exceptional ability to identify waste types with an accuracy exceeding 92.36%. Comprehensive ablation experiments, comparing different backbones, loss functions, and models, substantiate the efficacy of our approach. Our model achieves superior performance compared to traditional models, with a mean average precision increase of up to 2.39% when utilizing the asymmetric loss function, and switching to the ViT-B/16 backbone improves accuracy by 4.75% over ResNet-101.
engineering, chemical
What problem does this paper attempt to address?
This paper addresses the efficiency and accuracy of multi-label image recognition for municipal waste classification. As urbanization and population growth increase the volume of waste, sustainable and green waste management has become crucial. Traditional image recognition methods typically rely on single-label images and so ignore the fact that real-world scenes often contain several types of waste at once. In addition, few recognition efforts have targeted actual municipal waste data, and most research remains confined to laboratory settings. To this end, the paper proposes a Query2Label (Q2L) framework that uses the Vision Transformer (ViT-B/16) as its backbone and adopts an asymmetric loss function to handle the complexity of multi-label waste image classification. Experiments on the newly developed "Garbage In, Garbage Out" (GIGO) municipal waste dataset, which contains 25,000 street-level images with up to four waste types each, show that the Q2L framework identifies waste types with an accuracy exceeding 92.36%. Compared to traditional models, the Q2L model gains up to 2.39% in mean average precision when using the asymmetric loss function, and switching the backbone from ResNet-101 to ViT-B/16 improves accuracy by 4.75%. The model has the potential to classify large volumes of municipal waste efficiently and in real time, which could improve recycling efficiency, strengthen waste management, and help protect the urban environment and public health.
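To illustrate what an asymmetric loss does in the multi-label setting, the following is a minimal sketch of the commonly used formulation, in which each label is scored independently and negative labels are down-weighted more aggressively than positive ones. It is not the authors' implementation: the function name `asymmetric_loss` and the default values for `gamma_pos`, `gamma_neg`, and `margin` are assumptions chosen for illustration, and the exact variant and hyperparameters used in the paper are not stated in this summary.

```python
import math


def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Mean asymmetric loss over the labels of one image.

    Hypothetical helper for illustration only; the default
    hyperparameters are assumptions, not values from the paper.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # per-label sigmoid probability
        if y == 1:
            # Positive label: focal-style weighting with exponent gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin so
            # that very easy negatives contribute no loss, then apply a
            # stronger focusing exponent gamma_neg.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total / len(logits)


# One image with four candidate waste labels, two of them present.
example = asymmetric_loss([2.1, -3.0, 0.4, -1.2], [1, 0, 1, 0])
```

Setting `gamma_pos`, `gamma_neg`, and `margin` to zero reduces this sketch to ordinary binary cross-entropy, which makes the effect of the asymmetry easy to compare: absent labels, which dominate in multi-label data, are penalized less, so the rarer positive labels carry more of the training signal.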