AABLSTM: A Novel Multi-task Based CNN-RNN Deep Model for Fashion Analysis.

Xianlin Zhang, Mengling Shen, Xueming Li, Xiaojie Wang
DOI: https://doi.org/10.1145/3519029
2023-01-01
Abstract: With the rapid growth of online commerce and fashion-related applications, visual clothing analysis and recognition have become an active research topic in computer vision. In this paper, we propose AABLSTM, a novel deep CNN-RNN network that jointly addresses three visual fashion analysis tasks: clothing category classification, attribute detection, and landmark localization. The model is driven by a multi-task mechanism with three components. First, a bidirectional LSTM (Bi-LSTM) branch mines the semantic associations between related attributes to improve the precision of clothing category classification and attribute detection. Second, an hourglass-style sub-network with “down-up sampling” is constructed to boost the accuracy of fashion landmark localization. Finally, a specially designed multi-loss function is used to better optimize network training. Extensive experimental results on large-scale fashion datasets demonstrate the superior performance of our approach.
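The abstract does not state the form of the multi-loss function. As a rough illustration only, one common way to combine the three tasks is a weighted sum of a category cross-entropy, a multi-label attribute cross-entropy, and a landmark regression term. The sketch below assumes that formulation; the function names, the choice of per-task losses, and the weights are all hypothetical and are not taken from the paper.

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for single-label category classification (assumed loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_index]

def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy for multi-label attribute detection (assumed loss)."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Stable form of -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def landmark_l2(pred, truth, visible):
    """Mean squared distance over visible landmarks only (assumed loss)."""
    terms = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty), v in zip(pred, truth, visible)
        if v
    ]
    return sum(terms) / len(terms) if terms else 0.0

def multi_task_loss(cat_loss, attr_loss, lm_loss, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses; the weights are hypothetical."""
    w_cat, w_attr, w_lm = weights
    return w_cat * cat_loss + w_attr * attr_loss + w_lm * lm_loss
```

In such a setup, the weights trade off the tasks against each other and would typically be tuned on a validation set; the paper's own loss design may differ.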