Deep Multimodal Learning for Municipal Solid Waste Sorting

Lu Gang, Wang YuanBin, Xu HuXiu, Yang HuaYong, Zou Jun
DOI: https://doi.org/10.1007/s11431-021-1927-9
2021-01-01
Abstract: Automated waste sorting can dramatically increase waste sorting efficiency and reduce its regulation cost. Most current methods use only a single modality, such as image data or acoustic data, for waste classification, which makes it difficult to classify mixed and easily confused wastes. In these complex situations, using multiple modalities becomes necessary to achieve high classification accuracy. Traditionally, the fusion of multiple modalities has been limited by fixed handcrafted features. In this study, a deep-learning approach was applied to multimodal fusion at the feature level for municipal solid-waste sorting. More specifically, a pre-trained VGG16 and one-dimensional convolutional neural networks (1D CNNs) were used to extract features from visual data and acoustic data, respectively. These deeply learned features were then fused in the fully connected layers for classification. Comparative experiments showed that the proposed method was superior to the single-modality methods. Additionally, with deeply learned features, the feature-based fusion strategy performed better than the decision-based strategy.
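The difference between the feature-based and decision-based fusion strategies mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors stand in for the outputs of VGG16 and the 1D CNN, a single linear layer stands in for the fully connected layers, averaging class probabilities is only one common way to fuse decisions, and all function names and weights are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """One fully connected layer: one output score per weight row."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def feature_level_fusion(visual_feat, acoustic_feat, weights, bias):
    """Concatenate both feature vectors, then classify them jointly."""
    fused = list(visual_feat) + list(acoustic_feat)
    return softmax(linear(weights, bias, fused))


def decision_level_fusion(visual_feat, acoustic_feat, visual_head, acoustic_head):
    """Classify each modality separately, then average the probabilities.

    Each head is a (weights, bias) pair. Averaging is an assumption made
    for illustration; the abstract does not specify the fusion rule.
    """
    p_visual = softmax(linear(visual_head[0], visual_head[1], visual_feat))
    p_acoustic = softmax(linear(acoustic_head[0], acoustic_head[1], acoustic_feat))
    return [(v + a) / 2 for v, a in zip(p_visual, p_acoustic)]


# Toy inputs: 2 visual features, 3 acoustic features, 2 waste classes.
visual = [0.2, 0.9]
acoustic = [0.7, 0.1, 0.4]

joint_weights = [[0.5, -0.2, 0.1, 0.3, 0.0], [-0.1, 0.4, 0.2, -0.3, 0.6]]
joint_bias = [0.0, 0.1]

visual_head = ([[0.5, -0.2], [-0.1, 0.4]], [0.0, 0.1])
acoustic_head = ([[0.1, 0.3, 0.0], [0.2, -0.3, 0.6]], [0.0, 0.1])

print(feature_level_fusion(visual, acoustic, joint_weights, joint_bias))
print(decision_level_fusion(visual, acoustic, visual_head, acoustic_head))
```

In the feature-level variant the classifier sees both modalities at once, so it can learn interactions between visual and acoustic cues. In the decision-level variant each modality is scored independently before the results are combined.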