Abstract: In this paper, we propose an image matting framework called Salient Image Matting (SIM) to estimate the per-pixel opacity value of the most salient foreground in an image. To deal with the large semantic diversity of images, a trimap is conventionally required, as it provides important guidance about object semantics to the matting process. However, creating a good trimap is often expensive and time-consuming. The SIM framework simultaneously deals with the challenge of learning a wide range of semantics and salient object types in a fully automatic, end-to-end manner. Specifically, our framework produces accurate alpha mattes directly from an RGB input for a wide range of foreground objects, including cases where the foreground class, such as a human, appears in a context very different from the training data. This is done by employing a salient object detection model to produce a trimap of the most salient object in the image, which guides the matting model about higher-level object semantics. Our framework leverages large amounts of coarse annotations coupled with a heuristic trimap generation scheme to train the trimap prediction network so that it can produce trimaps for arbitrary foregrounds. Moreover, we introduce a multi-scale fusion architecture for the task of matting to better capture finer, low-level opacity semantics. With high-level guidance provided by the trimap network, our framework requires only a fraction of the expensive matting data needed by other automatic methods while still producing alpha mattes for a diverse range of inputs. We demonstrate our framework on a range of diverse images, and experimental results show that it compares favourably against state-of-the-art matting methods without the need for a trimap.
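For readers new to matting, the "per-pixel opacity value" in the abstract is the alpha in the standard compositing model I = alpha * F + (1 - alpha) * B, where F and B are the foreground and background colours. The sketch below only illustrates that model. It is not code from the paper, and the function names `composite` and `composite_image` are hypothetical.

```python
def composite(alpha, foreground, background):
    """Blend one RGB pixel with the compositing model I = alpha*F + (1-alpha)*B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(
        alpha * f + (1.0 - alpha) * b for f, b in zip(foreground, background)
    )


def composite_image(alpha_matte, fg, bg):
    """Apply the per-pixel blend to images stored as nested lists [row][col]."""
    return [
        [composite(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha_matte, fg, bg)
    ]
```

An alpha of 1 keeps the foreground colour, an alpha of 0 keeps the background colour, and fractional values (hair, fur, motion blur) mix the two. Matting is the inverse problem: recovering alpha from the observed image alone.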

What problem does this paper attempt to address?

The paper addresses the problem of automatically estimating the opacity value (the alpha matte) of each pixel of the most salient foreground object in an image. Conventionally, a trimap is required to handle the large semantic diversity of images, because it gives the matting process important guidance about object semantics. However, creating a good trimap is usually expensive and time-consuming. The Salient Image Matting (SIM) framework proposed in the paper aims to remove this requirement through fully automatic, end-to-end learning that handles a wide range of semantics and salient object types without a manually created trimap. Specifically, SIM uses a salient object detection model to generate a trimap of the most salient object in the image, which guides the matting model about high-level object semantics. The trimap prediction network is trained on large amounts of coarse annotations combined with a heuristic trimap generation scheme. In addition, the framework introduces a multi-scale fusion architecture to better capture fine-grained, low-level opacity semantics. As a result, SIM can generate accurate alpha mattes for diverse inputs using only a fraction of the expensive matting data that other automatic methods require, and without a user-provided trimap. According to the abstract, its results compare favourably with state-of-the-art matting methods.
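The summary mentions a heuristic scheme that turns coarse annotations into trimaps but does not describe it. One widely used heuristic, assumed here purely for illustration and not necessarily the one used in the paper, erodes and dilates a binary mask: pixels that survive erosion are labelled foreground, pixels outside the dilation are labelled background, and the band in between is marked unknown. The names `trimap_from_mask` and `radius` and the label values are hypothetical.

```python
# Label values are an assumption; 0/128/255 is a common trimap convention.
FOREGROUND, UNKNOWN, BACKGROUND = 255, 128, 0


def _window(mask, r, c, radius):
    """Collect mask values in the square neighbourhood, clipped at the border."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def trimap_from_mask(mask, radius=1):
    """Build a trimap from a binary mask (nested lists of 0/1).

    A pixel is foreground if its whole neighbourhood is 1 (it survives
    erosion), background if its whole neighbourhood is 0 (it lies outside
    the dilation), and unknown otherwise.
    """
    trimap = []
    for r in range(len(mask)):
        row = []
        for c in range(len(mask[0])):
            window = _window(mask, r, c, radius)
            if all(window):
                row.append(FOREGROUND)
            elif any(window):
                row.append(UNKNOWN)
            else:
                row.append(BACKGROUND)
        trimap.append(row)
    return trimap
```

A larger `radius` widens the unknown band. That tolerates coarser masks but leaves more pixels for the matting stage to resolve.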

Salient Image Matting

Semantic Image Matting

Semantic Image Matting: General and Specific Semantics

Disentangled Image Matting

A Saliency-Based Sampling Method for Image Matting.

AlphaNet: An Attention Guided Deep Network for Automatic Image Matting

Portrait Matting via Semantic and Detail Guidance.

Boosting General Trimap-free Matting in the Real-World Image

Weakly Supervised Image Matting Via Patch Clustering

Attention-guided Temporally Coherent Video Object Matting

Cascaded Segmented Matting Network for Human Matting

Matting Anything

Confidence-driven Image Co-Matting.

Multi-guided-based image matting via boundary detection

Text-Guided Portrait Image Matting

Coarse Semantic Guided Alpha Matting Via Simultaneous Foreground and Background Estimation

Lightweight Image Matting via Efficient Non-local Guidance.

Automatic Trimap Generation for Image Matting

Highly Efficient Natural Image Matting

Semantic-guided Automatic Natural Image Matting with Trimap Generation Network and Light-weight Non-local Attention

User-Guided Deep Human Image Matting Using Arbitrary Trimaps