A semantic segmentation algorithm for fashion images based on modified mask RCNN
He, Wentao; Wang, Jing'an; Wang, Lei; Pan, Ruru; Gao, Weidong
DOI: https://doi.org/10.1007/s11042-023-14958-1
IF: 2.577
2023-03-14
Multimedia Tools and Applications
Abstract: The semantic segmentation of human body images has great application potential in many fields, such as autonomous driving, artificial intelligence (AI) face changing, and virtual try-on. Many researchers now use additional human body posture information to generate multi-level human body parsing images. However, existing methods have limitations when faced with multiple poses and overlapping targets. In this paper, a novel algorithm with pixel-level accuracy, based on Mask RCNN, is proposed. In the feature extraction process, a multi-scale feature fusion module that applies dilated convolution is proposed to obtain richer semantic information from different receptive fields. A small residual module is added to the original residual unit structure to enlarge the receptive field of each layer, so that both details and global characteristics are captured. Three convolution kernels with different dilation rates are designed to obtain receptive fields of different scales. The experimental results show that the proposed method achieves better performance when both object localization and target classification are considered.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
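
The abstract's idea of three kernels with different dilation rates producing receptive fields of different scales can be illustrated with a minimal one-dimensional sketch in plain Python. This is not the authors' implementation: the dilation rates (1, 2, 3), the averaging used to fuse the branches, and the function names `effective_kernel_size`, `dilated_conv1d`, and `fuse_multi_scale` are all assumptions made for illustration, since the abstract does not specify them.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation).

    Assumes an odd kernel length so the padding is symmetric.
    """
    k = len(kernel)
    pad = (effective_kernel_size(k, dilation) - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(kernel[j] * padded[i + j * dilation] for j in range(k)))
    return out


def fuse_multi_scale(signal, kernel, dilations=(1, 2, 3)):
    """Run one branch per dilation rate and fuse them.

    Fusion by element-wise averaging is an assumption of this sketch;
    the dilation rates are hypothetical as well.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 1, 1]
    for d in (1, 2, 3):
        print(d, effective_kernel_size(len(kernel), d),
              dilated_conv1d(impulse, kernel, d))
    print(fuse_multi_scale(impulse, kernel))
```

With a 3-tap kernel, dilation rates 1, 2, and 3 cover spans of 3, 5, and 7 samples respectively, so each branch responds to context at a different scale without adding parameters. A two-dimensional version inside a residual unit would follow the same pattern with a deep learning framework's dilated convolution layers.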