Multi-modal body part segmentation of infants using deep learning

Florian Voss,Noah Brechmann,Simon Lyra,Jöran Rixen,Steffen Leonhardt,Christoph Hoog Antink
DOI: https://doi.org/10.1186/s12938-023-01092-0
2023-03-22
Abstract:Background: Monitoring the body temperature of premature infants is vital, as it allows optimal temperature control and may provide early warning signs for severe diseases such as sepsis. Thermography may be a non-contact and wireless alternative to state-of-the-art, cable-based methods. For monitoring use in clinical practice, automatic segmentation of the different body regions is necessary due to the movement of the infant. Methods: This work presents and evaluates algorithms for automatic segmentation of infant body parts using deep learning methods. Based on a U-Net architecture, three neural networks were developed and compared. While the first two only used one imaging modality (visible light or thermography), the third applied a feature fusion of both. For training and evaluation, a dataset containing 600 visible light and 600 thermography images from 20 recordings of infants was created and manually labeled. In addition, we used transfer learning on publicly available datasets of adults in combination with data augmentation to improve the segmentation results. Results: Individual optimization of the three deep learning models revealed that transfer learning and data augmentation improved segmentation regardless of the imaging modality. The fusion model achieved the best results during the final evaluation with a mean Intersection-over-Union (mIoU) of 0.85, closely followed by the RGB model. Only the thermography model achieved a lower accuracy (mIoU of 0.75). The results of the individual classes showed that all body parts were well-segmented, only the accuracy on the torso is inferior since the models struggle when only small areas of the skin are visible. Conclusion: The presented multi-modal neural networks represent a new approach to the problem of infant body segmentation with limited available data. Robust results were obtained by applying feature fusion, cross-modality transfer learning and classical augmentation strategies.
What problem does this paper attempt to address?