M+MNet: A Multibranch Network with Mixed Precision Training for Image Aesthetics Assessment

Shuai He, Xuejing Kang, Zhiyu Zhang, Anlong Ming
DOI: https://doi.org/10.2139/ssrn.4387004
2023-01-01
Abstract: We present a network design to compensate for the neglect of background information in current image aesthetics assessment (IAA) methods while providing strategies for overcoming the dilemma between training costs and performance. The method, referred to as M+MNet, is a multibranch and mixed precision network. In our network design, through the addition of two exponentially weighted pooling methods, M+MNet can selectively boost the extraction of either background or salient information during the downsampling procedure. We also propose an unsupervised data augmentation method named Corner Grid, which leverages the diffusive characteristics of convolution to force the network to seek more relevant background information. In our training strategy, we perform mixed precision training by switching the precision format, thus significantly reducing the time and memory consumption of data representation and transmission. Most of our methods, although specifically designed for IAA tasks, have been shown to generalize to other IAA works. For performance verification, we develop a large-scale benchmark (the most comprehensive thus far) by comparing 17 methods with M+MNet on two representative datasets: AVA and FLICKR-AES. M+MNet achieves state-of-the-art performance on all tasks. The source code will be available in the link.
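The abstract does not give the exact form of the two exponentially weighted pooling methods. As a rough illustration of the general idea only, the sketch below pools each window with softmax-style weights exp(beta * x). The function names, the `beta` parameter, and the reading that large activations correspond to salient content and small ones to background are all assumptions made for this example, not the authors' formulation.

```python
import math


def exp_weighted_pool(window, beta):
    """Pool one window using weights proportional to exp(beta * x).

    Hypothetical helper: beta > 0 favors large activations, beta < 0 favors
    small ones, and beta == 0 reduces to plain average pooling.
    """
    # Subtract the largest exponent so math.exp never overflows.
    shift = max(beta * x for x in window)
    weights = [math.exp(beta * x - shift) for x in window]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, window)) / total


def pool2d(image, beta, size=2):
    """Apply exp_weighted_pool over non-overlapping size x size windows."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(exp_weighted_pool(window, beta))
        pooled.append(pooled_row)
    return pooled
```

Under these assumptions, a branch using a large positive `beta` behaves much like max pooling, while a branch using a negative `beta` keeps the low-activation values that max pooling would discard, which is one way two pooling variants could steer downsampling toward different image content.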