S5Utis: Structured State-Space Sequence SegNeXt UNet-like Tongue Image Segmentation in Traditional Chinese Medicine

Donglei Song, Hongda Zhang, Lida Shi, Hao Xu, Ying Xu
DOI: https://doi.org/10.3390/s24134046
IF: 3.9
2024-06-21
Sensors
Abstract: Intelligent Traditional Chinese Medicine can provide people with a convenient way to participate in daily health care. The ease of acceptance of Traditional Chinese Medicine is also a major advantage in promoting health management. In Traditional Chinese Medicine, tongue imaging is an important step in the examination process. The segmentation and processing of the tongue image directly affect the results of intelligent Traditional Chinese Medicine diagnosis. As intelligent Traditional Chinese Medicine continues to develop, remote diagnosis and patient participation will play important roles. Smartphone sensor cameras can provide irreplaceable data collection capabilities for enhancing interaction in smart Traditional Chinese Medicine. However, the captured images differ in size and quality because of differences in shooting equipment, the professionalism of the photographer, and the subject's cooperation. Most current tongue image segmentation algorithms are based on data collected by professional tongue diagnosis instruments in standard environments, and they have not been shown to segment tongue images well in complex environments. Therefore, we propose a segmentation algorithm for tongue images collected in complex multi-device and multi-user environments. We use convolutional attention and extend state space models to the 2D environment in the encoder. Then, cross-layer connection fusion is used in the decoder part to fuse shallow texture and deep semantic features. Through segmentation experiments on tongue image datasets collected by patients and doctors in real-world settings, our algorithm significantly improves segmentation performance and accuracy.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
The paper addresses the segmentation of tongue images in Traditional Chinese Medicine, specifically images captured in complex multi-device, multi-user environments. Smartphone cameras make remote care and intelligent Traditional Chinese Medicine more convenient, and tongue imaging is an important step in diagnosis, so image quality and segmentation quality directly affect the accuracy of an intelligent diagnosis. Most existing tongue image segmentation algorithms, however, rely on data collected by professional devices in standardized environments and do not adapt to the varied conditions of real-world capture.

To address this, the paper proposes S5Utis (Structured State-Space Sequence SegNeXt UNet-like Tongue Image Segmentation). The encoder combines a convolutional attention mechanism with a structured state-space sequence model (S4) extended to the 2D setting. The decoder fuses shallow texture features with deep semantic features through cross-layer connections, with the aim of improving segmentation performance and accuracy on images collected with different devices and by different users.

The experiments show that S5Utis achieves better segmentation accuracy on tongue images taken with non-professional devices by non-laboratory personnel, and that it compares favorably with other popular semantic segmentation networks. The main contributions are: a tongue image segmentation model built on the SegNeXt network; an improved S4-2D convolutional self-attention used for multi-scale feature fusion; and a decoder that uses cross-layer and residual connections for layer-wise upsampling to improve segmentation accuracy.
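The two ideas named above, extending a state-space recurrence to 2D and fusing deep features with shallow ones in the decoder, can be illustrated with a small pure-Python sketch. It is not the S5Utis implementation: the scalar recurrence `x_k = a*x_{k-1} + b*u_k`, the row-plus-column averaging, the nearest-neighbour upsampling, and the additive fusion are all simplifying assumptions, and every function name (`ssm_scan_1d`, `ssm_scan_2d`, `upsample_nearest_2x`, `fuse_skip`) is hypothetical.

```python
# Illustrative sketch only; names and design choices are assumptions,
# not the S5Utis architecture from the paper.

def ssm_scan_1d(u, a=0.5, b=1.0, c=1.0):
    """Scalar linear state-space recurrence.

    x_k = a * x_{k-1} + b * u_k,   y_k = c * x_k
    """
    x = 0.0
    y = []
    for u_k in u:
        x = a * x + b * u_k
        y.append(c * x)
    return y


def ssm_scan_2d(image, a=0.5):
    """Run the 1D scan along every row and every column, then average.

    This is one simple way to give each pixel context from both axes.
    """
    h, w = len(image), len(image[0])
    row_out = [ssm_scan_1d(row, a) for row in image]
    col_out = [ssm_scan_1d([image[i][j] for i in range(h)], a) for j in range(w)]
    return [
        [0.5 * (row_out[i][j] + col_out[j][i]) for j in range(w)]
        for i in range(h)
    ]


def upsample_nearest_2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_skip(deep, shallow):
    """Upsample the deep map and add the shallow map element-wise.

    A residual-style stand-in for cross-layer fusion; `shallow` must be
    exactly twice the height and width of `deep`.
    """
    up = upsample_nearest_2x(deep)
    return [[d + s for d, s in zip(dr, sr)] for dr, sr in zip(up, shallow)]
```

For example, `ssm_scan_2d([[1.0, 0.0], [0.0, 0.0]])` returns `[[1.0, 0.25], [0.25, 0.0]]`: the single bright pixel decays along both its row and its column, which is the kind of long-range, two-axis context a 2D state-space layer is meant to provide.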