Multi-Scale Spiking Pyramid Wireless Communication Framework for Food Recognition

Wenrui Li, Jiahui Li, Mengyao Ma, Xiaopeng Hong, Xiaopeng Fan
DOI: https://doi.org/10.1109/tmm.2024.3368964
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Food recognition applications in human health have recently garnered significant attention in the field of computer vision. With the advancement of mobile devices, robust food recognition over wireless communication has become a practical and challenging application scenario. We propose a novel Multi-scale Spiking Pyramid Transmission Network (MSPTN) to tackle this challenge. The MSPTN learns diverse and complementary local and global feature maps simultaneously, generating a comprehensive description of food images that captures the correlations of food-specific features. The feature sender uses a three-layer Spiking Neural Network (SNN). The proposed sender compresses features into sparse and discrete spike trains, significantly reducing the required transmission bandwidth and improving channel utilization and energy efficiency. Our model introduces the Compressed Factorized Bilinear block (CFB), which employs a low-rank feature approximation to reduce computational complexity and feature transmission volume while preserving the discriminative features. An enhancement reasoning module refines the received features by projecting them into a higher-dimensional space and then using self-attention and sum pooling to compress them back to the original dimension. We conduct extensive experiments on the ETH Food-101 and Food2k datasets. Our results show that the MSPTN achieves state-of-the-art recognition performance, even with binary spike trains. The MSPTN also exhibits remarkable robustness in wireless communication scenarios. With the combination of CFB, SNN, and EFB, our model achieves significant efficiency gains, including a nearly nine-fold decrease in feature transmission volume and a three-fold improvement in runtime and computational memory.
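To make the idea of compressing features into sparse, discrete spike trains concrete, the following is a minimal sketch of a generic leaky integrate-and-fire (LIF) encoder. It is an illustration only, not the paper's three-layer SNN sender: the function name `lif_encode` and the parameters `num_steps`, `threshold`, and `decay` are assumptions chosen for this example, and the abstract does not specify the neuron model or its settings.

```python
def lif_encode(features, num_steps=8, threshold=1.0, decay=0.5):
    """Encode real-valued features as binary spike trains (hypothetical helper).

    Each feature drives one LIF neuron with a constant input current.
    Returns a list of `num_steps` rows, each a list of 0/1 spikes.
    """
    membrane = [0.0] * len(features)  # membrane potential per neuron
    spike_train = []
    for _ in range(num_steps):
        step = []
        for i, current in enumerate(features):
            # Leaky integration: old potential decays, input is added.
            membrane[i] = decay * membrane[i] + current
            if membrane[i] >= threshold:
                step.append(1)      # neuron fires
                membrane[i] = 0.0   # hard reset after a spike
            else:
                step.append(0)
        spike_train.append(step)
    return spike_train


# Toy feature vector (assumed values, not taken from the paper).
features = [0.0, 0.3, 0.6, 1.2]
spikes = lif_encode(features, num_steps=4)

# Rough payload comparison: 32-bit floats versus one bit per spike slot.
float_bits = 32 * len(features)
spike_bits = len(spikes) * len(features)
```

In this toy setting, weak activations never fire and strong ones fire at every step, so the transmitted payload consists only of bits. The actual saving depends on the number of time steps and the feature dimensionality, so the ratio here should not be read as the paper's reported figure.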
computer science, information systems, telecommunications, software engineering
What problem does this paper attempt to address?