Highly Efficient Natural Image Matting

Yijie Zhong,Bo Li,Lv Tang,Hao Tang,Shouhong Ding
DOI: https://doi.org/10.48550/arXiv.2110.12748
2021-10-25
Abstract: Over the last few years, deep learning based approaches have achieved outstanding improvements in natural image matting. However, there are still two drawbacks that impede the widespread application of image matting: the reliance on user-provided trimaps and the heavy model sizes. In this paper, we propose a trimap-free natural image matting method with a lightweight model. With a lightweight basic convolution block, we build a two-stage framework: the Segmentation Network (SN) is designed to capture sufficient semantics and classify the pixels into unknown, foreground and background regions; the Matting Refine Network (MRN) aims at capturing detailed texture information and regressing accurate alpha values. With the proposed Cross-level Fusion Module (CFM), SN can efficiently utilize multi-scale features with less computational cost. The Efficient Non-local Attention module (ENA) in MRN can efficiently model the relevance between different pixels and help regress high-quality alpha values. Utilizing these techniques, we construct an extremely lightweight model, which achieves comparable performance with ~1% of the parameters (344k) of large models on popular natural image matting benchmarks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two main obstacles to the wider use of natural image matting:

1. **Dependence on user-provided trimaps**: Traditional natural image matting methods usually require a user-supplied trimap as an auxiliary input, which is a burden for novice users who lack matting expertise. A trimap is a rough segmentation of the input image into foreground (\(\alpha_i = 1\)), background (\(\alpha_i = 0\)) and unknown (\(\alpha_i\in(0, 1)\)) regions. This dependence limits the scenarios in which matting can be applied.
2. **Model size and computational cost**: Existing natural image matting methods often rely on large models that consume substantial resources and are difficult to run on low-power devices with limited storage and compute. This also narrows the practical scope of these methods.

To address both, the paper proposes a trimap-free, lightweight natural image matting method built as a two-stage framework:

- **First stage (Segmentation Network, SN)**: Captures sufficient semantic information and classifies each pixel as unknown, foreground or background. The Cross-level Fusion Module (CFM) lets SN use multi-scale features efficiently at lower computational cost.
- **Second stage (Matting Refine Network, MRN)**: Captures detailed texture information and regresses accurate alpha values. The Efficient Non-local Attention module (ENA) lets MRN model the relevance between different pixels, which helps it regress high-quality alpha values.

With these techniques, the paper constructs an extremely lightweight model with 344k parameters, roughly 1% of the parameter count of large models, that still achieves comparable performance on popular natural image matting benchmarks.
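To make the three-region split concrete, the following is a minimal NumPy sketch of the trimap definition given above (foreground where alpha is 1, background where alpha is 0, unknown in between). The label ids, the `eps` tolerance and both function names are hypothetical choices for illustration. The fusion step is only one plausible way to combine a region map with a refined alpha map; it is an assumption, not something the summary above specifies about the paper's method.

```python
import numpy as np

# Hypothetical label ids for the three regions (not taken from the paper).
BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2


def alpha_to_regions(alpha, eps=1e-6):
    """Label each pixel by the trimap definition: alpha = 0 is background,
    alpha = 1 is foreground, anything strictly in between is unknown."""
    alpha = np.asarray(alpha, dtype=np.float64)
    regions = np.full(alpha.shape, UNKNOWN, dtype=np.uint8)
    regions[alpha <= eps] = BACKGROUND
    regions[alpha >= 1.0 - eps] = FOREGROUND
    return regions


def fuse(regions, refined_alpha):
    """Assumed fusion rule: keep 0/1 in the confident regions and use the
    refined alpha values only where the region map says 'unknown'."""
    refined = np.clip(np.asarray(refined_alpha, dtype=np.float64), 0.0, 1.0)
    out = np.where(regions == FOREGROUND, 1.0, 0.0)
    unknown = regions == UNKNOWN
    out[unknown] = refined[unknown]
    return out


alpha = np.array([[0.0, 0.3], [1.0, 0.75]])
regions = alpha_to_regions(alpha)
final_alpha = fuse(regions, np.full((2, 2), 0.5))
```

Here `regions` marks the two fractional pixels as unknown, and `final_alpha` replaces only those two pixels with the (dummy) refined value while leaving the confident background and foreground pixels at 0 and 1.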