When the Small-Loss Trick is Not Enough: Multi-Label Image Classification with Noisy Labels Applied to CCTV Sewer Inspections

Keryan Chelouche, Marie Lachaize, Marine Bernard, Louise Olgiati, Remi Cuingnet
2024-10-10
Abstract: The maintenance of sewerage networks, with their millions of kilometers of pipe, heavily relies on efficient Closed-Circuit Television (CCTV) inspections. Many promising approaches based on multi-label image classification have leveraged databases of historical inspection reports to automate these inspections. However, the significant presence of label noise in these databases, although known, has not been addressed. While extensive research has explored the issue of label noise in single-label classification (SLC), little attention has been paid to label noise in multi-label classification (MLC). To address this, we first adapted three sample selection SLC methods (Co-teaching, CoSELFIE, and DISC) that have proven robust to label noise. Our findings revealed that sample selection based solely on the small-loss trick can handle complex label noise, but it is sub-optimal. Adapting hybrid sample selection methods to noisy MLC appeared to be a more promising approach. In light of this, we developed a novel method named MHSS (Multi-label Hybrid Sample Selection) based on CoSELFIE. Through an in-depth comparative study, we demonstrated the superior performance of our approach in dealing with both synthetic complex noise and real noise, thus contributing to the ongoing efforts towards effective automation of CCTV sewer pipe inspections.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses label noise in multi-label image classification, specifically in the context of closed-circuit television (CCTV) sewer inspections.

1. **Background Problem**: Sewer maintenance relies heavily on CCTV inspections. The historical inspection databases used to train classifiers contain a large amount of label noise, which degrades the performance of supervised learning algorithms.
2. **Research Objective**: The main goal of the paper is to develop a method that handles label noise in multi-label classification (MLC) and to apply it to CCTV sewer inspection data.

To achieve this goal, the authors worked in the following areas:

- **Adaptability Assessment of Existing Methods**: The authors first adapted three sample selection methods from single-label classification (Co-teaching, CoSELFIE, and DISC) to the multi-label setting. They found that sample selection based solely on the "small-loss trick" can handle complex label noise but is sub-optimal, and that hybrid sample selection methods are a more promising direction.
- **Proposal of a New Method**: Based on this assessment, the authors proposed MHSS (Multi-label Hybrid Sample Selection), a method built on CoSELFIE and designed for multi-label classification.
- **Experimental Validation**: Experiments on two public datasets (UcMerced and TreeSatAI) were conducted to validate MHSS, especially in scenarios with complex noise injection.

In summary, the paper proposes MHSS, a multi-label hybrid sample selection method that the authors report outperforms the compared approaches on both synthetic complex noise and real noise.
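To make the "small-loss trick" concrete, here is a minimal NumPy sketch of small-loss sample selection carried over to a multi-label setting. It is an illustration only, not the authors' implementation: the per-label selection scheme, the function names `per_label_bce` and `small_loss_mask`, and the `keep_ratio` parameter are all assumptions introduced here. The idea is that a network tends to fit clean labels before noisy ones, so the (sample, label) pairs with the smallest loss are treated as likely clean and the rest are excluded from the training loss.

```python
import numpy as np


def per_label_bce(probs, labels, eps=1e-7):
    """Element-wise binary cross-entropy, shape (n_samples, n_labels)."""
    p = np.clip(probs, eps, 1.0 - eps)
    return -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))


def small_loss_mask(probs, labels, keep_ratio):
    """Boolean mask of (sample, label) pairs kept by the small-loss trick.

    Assumption: selection is done independently for each label column,
    keeping the `keep_ratio` fraction of samples with the smallest loss.
    """
    losses = per_label_bce(probs, labels)
    n_samples, n_labels = losses.shape
    n_keep = max(1, int(round(keep_ratio * n_samples)))
    # Row indices of the n_keep smallest losses in each column.
    keep_rows = np.argsort(losses, axis=0)[:n_keep]
    mask = np.zeros_like(losses, dtype=bool)
    mask[keep_rows, np.arange(n_labels)] = True
    return mask


# Toy batch: 4 samples, 2 labels. Sample 2 has a suspicious label 0
# (predicted 0.1, annotated 1), so its loss for that label is large.
probs = np.array([[0.9, 0.2], [0.8, 0.1], [0.1, 0.9], [0.3, 0.7]])
labels = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

mask = small_loss_mask(probs, labels, keep_ratio=0.75)
losses = per_label_bce(probs, labels)
# Training loss computed only over the selected (sample, label) pairs.
selected_loss = (losses * mask).sum() / mask.sum()
```

In this toy batch the pair (sample 2, label 0) is dropped because its loss is the largest in its column. In practice `keep_ratio` is usually scheduled from an estimate of the noise rate. The abstract's finding is that selection based on loss alone is sub-optimal under complex noise, which is what motivates hybrid selection methods such as MHSS.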