DSFNet: Video Salient Object Detection Using a Novel Lightweight Deformable Separable Fusion Network

Hemraj Singh, Mridula Verma, Ramalingaswamy Cheruku
DOI: https://doi.org/10.1109/tim.2024.3470045
IF: 5.6
2024-10-19
IEEE Transactions on Instrumentation and Measurement
Abstract: Geometric variations in the spatial and temporal features of objects in video streams make video salient object detection (VSOD) difficult. Most existing deep-learning methods use fixed-size kernels, which limit the receptive field (RF) available for extracting local and global features and fail to capture the visual semantics of the foreground and background of deformed objects. Moreover, because of their complex architectures, these methods need more computational resources, which limits their deployment in real-world scenarios. To address these challenges and to strike a balance between performance and computational complexity, a deformable separable fusion network (DSFNet) is proposed, which dynamically extracts geometric spatiotemporal variations from multiscale features without compromising the network's complexity. A Swarm-Enhanced Adam (SEAdam) optimizer is also proposed to adaptively balance the exploration and exploitation of gradients locally and globally and to improve the convergence speed. This is the first work that extracts the multiscale geometric local and global context-based visual information. Through extensive experiments on six highly challenging benchmark datasets, we show that the proposed model outperforms state-of-the-art models in terms of the number of parameters, floating-point operations (FLOPs), and latency.
engineering, electrical & electronic; instruments & instrumentation