Abstract: The goal of moving object segmentation is to separate moving objects from stationary backgrounds in videos. One major challenge is developing a universal model for videos from various natural scenes, since previous methods are often effective only in specific scenes. In this paper, we propose a method called Learning Temporal Distribution and Spatial Correlation (LTS) that has the potential to be a general solution for universal moving object segmentation. In the proposed approach, the distribution from temporal pixels is first learned by our Defect Iterative Distribution Learning (DIDL) network for scene-independent segmentation. Notably, the DIDL network incorporates an improved product distribution layer that we have newly derived. Then, the Stochastic Bayesian Refinement (SBR) network, which learns the spatial correlation, is proposed to improve the binary mask generated by the DIDL network. Benefiting from the scene independence of the temporal distribution and the accuracy improvement resulting from the spatial correlation, the proposed approach performs well for almost all videos from diverse and complex natural scenes with fixed parameters. Comprehensive experiments on standard datasets including LASIESTA, CDNet2014, BMC, and SBMI2015, as well as 128 real-world videos, demonstrate the superiority of the proposed approach over state-of-the-art methods with or without deep learning networks. To the best of our knowledge, this work has high potential to be a general solution for moving object segmentation in real-world environments. The code and real-world videos are available on GitHub: https://github.com/guanfangdong/LTS-UniverisalMOS

Leveraging Motion Priors in Videos for Improving Human Segmentation

Adaptive Hierarchical Motion-Focused Model for Video Prediction

Saliency Flow Based Video Segmentation Via Motion Guided Contour Refinement

Unsupervised Online Video Object Segmentation with Motion Property Understanding

MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation

PriorMotion: Generative Class-Agnostic Motion Prediction with Raster-Vector Motion Field Priors

Segmenting the motion components of a video: A long-term unsupervised model

Zero-Shot Monocular Motion Segmentation in the Wild by Combining Deep Learning with Geometric Motion Model Fusion

Dense Monocular Motion Segmentation Using Optical Flow and Pseudo Depth Map: A Zero-Shot Approach

Learning Motion Priors for 4D Human Body Capture in 3D Scenes

Self-supervised Video Object Segmentation Using Motion Feature Compensation

Human Motion Transfer from Poses in the Wild

Motion-state Alignment for Video Semantic Segmentation

Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

Learning to Segment Human by Watching YouTube

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance

Using Diffusion Priors for Video Amodal Segmentation

Domain Adaptative Video Semantic Segmentation Via Motion-Guided Domain Bridge

Learning Temporal Distribution and Spatial Correlation Towards Universal Moving Object Segmentation

Decomposed Human Motion Prior for Video Pose Estimation via Adversarial Training