Learning from ambiguous labels for X-Ray security inspection via weakly supervised correction

Wei Wang, Linyang He, Guohua Cheng, Ting Wen, Yan Tian
DOI: https://doi.org/10.1007/s11042-023-15299-9
IF: 2.577
2024-01-01
Multimedia Tools and Applications
Abstract: X-ray security inspection has been dominated by supervised learning detectors for several years. Extreme viewing angles, overlapping occlusion, and the diversity of inspected items produce ambiguous objects, which in turn introduce ambiguous labels into the training of supervised detectors. It is well known that the performance of a supervised learning detector depends heavily on the quality of its labels. Human-annotated labels become less reliable and less consistent when ambiguous objects lose their key features. As the proportion of unreliable labels grows, contraband detection is strongly degraded. To mitigate this problem, an end-to-end weakly supervised correction (WSC) method with three modules for denoising and rectifying ambiguous labels is proposed. (1) X-ray energy awareness blending (X-Blending) extracts ambiguous images and reliable images during each iteration and mixes them into a single image, which improves the stability and efficiency of training on ambiguous images. (2) A weakly supervised head (WSH) is embedded in the supervised detector to rectify the noisy labels of ambiguous objects. (3) An adaptive label corrector (ALC) dynamically combines object similarity and confidence measures to generate credible labels and reweighting factors that adjust sample contributions. WSC is the first work to achieve end-to-end ambiguous label rectification in the field of contraband detection. Unlike traditional contraband detection models, WSC incorporates weakly supervised learning to provide more prior knowledge for learning from uncertain labels and to obtain effective feature information from ambiguous objects. When applied to Faster R-CNN, experiments show that WSC increases the average precision (AP) by 3.3% and 4.5% on the EDXray and PIDray datasets, respectively.
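The abstract does not give formulas for X-Blending or the ALC, so the following is only a minimal illustrative sketch of the two general ideas it names: mixing an ambiguous image with a reliable one, and combining a similarity measure with a confidence measure to decide whether to replace a label and how much weight to give the sample. The function names (`blend_images`, `label_score`, `correct_label`), the convex-combination blend, the linear score, and the parameters `alpha`, `beta`, and `threshold` are all assumptions made for illustration; the sketch does not model X-ray energy information and should not be read as the authors' method.

```python
from typing import List, Tuple

# Grayscale image as rows of pixel intensities in [0, 1] (assumed representation).
Image = List[List[float]]


def blend_images(ambiguous: Image, reliable: Image, alpha: float = 0.5) -> Image:
    """Pixelwise convex combination of two equally sized images (mixup-style).

    Assumption: a plain linear blend stands in for the paper's X-Blending.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(ambiguous) != len(reliable) or any(
        len(a) != len(r) for a, r in zip(ambiguous, reliable)
    ):
        raise ValueError("images must have the same shape")
    return [
        [alpha * a + (1.0 - alpha) * r for a, r in zip(row_a, row_r)]
        for row_a, row_r in zip(ambiguous, reliable)
    ]


def label_score(similarity: float, confidence: float, beta: float = 0.5) -> float:
    """Hypothetical credibility score: linear mix of similarity and confidence."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    score = beta * similarity + (1.0 - beta) * confidence
    return min(1.0, max(0.0, score))  # clamp to [0, 1]


def correct_label(
    annotated: str,
    predicted: str,
    similarity: float,
    confidence: float,
    threshold: float = 0.7,
) -> Tuple[str, float]:
    """Return (label, sample weight) under the assumed correction rule.

    If the predicted label is credible enough, it replaces the annotation and
    the score becomes the sample weight. Otherwise the annotation is kept and
    weighted by 1 - score, so it counts less as evidence against it grows.
    """
    score = label_score(similarity, confidence)
    if score >= threshold:
        return predicted, score
    return annotated, 1.0 - score
```

For example, under these assumptions `correct_label("knife", "scissors", 0.9, 0.9)` replaces the annotation with the predicted label, while low similarity and confidence leave the human annotation in place with a high weight.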