ML-CapsNet meets VB-DI-D: A novel distortion-tolerant baseline for
Zhongqi Lin, Zengwei Zheng, Jingdun Jia, Wanlin Gao, Feng Huang
DOI: https://doi.org/10.1016/j.engappai.2023.105937
2023-01-01
Abstract: Images gathered under spatiotemporally varying perturbations (e.g., overexposure, jitter, and motion) frequently undergo visual distortions (e.g., shear, defocus blur, affine transformation, and speckle noise). Lacking an effective information carrier, prior works cannot extract sufficient original representations of instances from corrupted images and thus fail to extrapolate to various geometric transformations. This paper proposes a Distortion-Tolerant Capsule Network (DT-CapsNet) to perform object detection while resisting visual distortions. It first learns the distribution of capsule encoding vectors as a new information carrier, using a feature extractor dubbed Multi-lane Capsule Network (ML-CapsNet). This model consists of three independent encoder lanes and is supported by a modified Segment-By-Segment Dynamic Routing Agreement (SBS-DRA). Then, invariant dimension detection and elimination, descriptor generation, and correspondence establishment are conducted on the learned vector distributions by an adaptive algorithm dubbed Vector-Based Deformation-Invariant Descriptor (VB-DI-D). Finally, a reliable soft matching with relaxation margins is performed between the patterns of the original standard instances and those of the disturbed instances. Quantitative and ablation experiments show that DT-CapsNet delivers competitive perturbed object detection performance among state-of-the-art methods: it achieves the highest testing accuracy (90.86% versus the second-highest score of 90.53%) on a hand-crafted wheat dataset, and the highest average testing accuracy (91.18% versus the second-highest score of 91.15%) on three public benchmarks (Stanford Cars, Stanford Dogs, CUB-200-2011). The results indicate that DT-CapsNet improves invariance against numerous commonly encountered geometric distortions.
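The abstract does not spell out how SBS-DRA modifies dynamic routing, so the sketch below shows only the standard routing-by-agreement procedure from the original capsule network literature, as background for what the modified variant builds on. It is not the paper's algorithm. The function names, tensor shapes, and the choice of three iterations are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Standard routing-by-agreement (NOT the paper's SBS-DRA variant).

    u_hat: prediction vectors from input capsules to output capsules,
           shape (num_in, num_out, dim_out). Shapes are assumptions.
    Returns the output capsule vectors and the final coupling coefficients.
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules per input capsule.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash to get output capsules.
        s = np.sum(c[:, :, None] * u_hat, axis=0)
        v = squash(s)
        # Raise logits where a prediction agrees with the resulting output.
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v, c

# Usage with random prediction vectors (6 input capsules, 3 output capsules).
rng = np.random.default_rng(0)
v, c = dynamic_routing(rng.normal(size=(6, 3, 4)))
```

The length of each output vector in `v` can be read as the presence probability of the entity that capsule encodes, which is why the squash function bounds it below 1.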