A Novel FCNs‐ConvLSTM Network for Video Salient Object Detection

Hai Huang, Chang Liu, Lei Tian, Junsheng Mu, Xiaojun Jing
DOI: https://doi.org/10.1002/cta.2924
IF: 2.378
2021-01-01
International Journal of Circuit Theory and Applications
Abstract: A video saliency detection model based on deep learning is proposed. It improves on existing fully convolutional network (FCN)‐based models by introducing a convolutional long short‐term memory (ConvLSTM) module. The ConvLSTM module splits its input into two flows of two layers each. The two flows use different dilation rates and therefore have different receptive fields, which helps the proposed model delineate object contours more accurately. Unlike the FCN modules, which take unordered frames, the ConvLSTM module receives frames in temporal order, so the proposed model can learn both spatial and temporal information from video data. Because the dataset contains few manually labeled annotations, augmentation techniques are used during training to expand it: mirror transformation, added Gaussian noise, and dropping every other frame to simulate fast motion. The proposed FCNs‐ConvLSTM model is trained and evaluated on a widely used dataset. At a threshold of 125, recall improves from 0.52 to 0.64 while precision stays at a similar level (0.72), and the maximum F‐measure increases from 0.66 to 0.70, which indicates that the proposed model is better at detecting moving objects.
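The three augmentations named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the noise level `sigma`, and the representation of a frame as a list of rows of 8-bit grayscale values are all assumptions made for illustration. The sketch keeps frames in temporal order, mirrors the ground-truth mask together with its frame, and applies noise to the frame only.

```python
import random

def mirror(frame):
    """Flip one frame left to right (a frame is a list of pixel rows)."""
    return [row[::-1] for row in frame]

def add_gaussian_noise(frame, sigma, rng):
    """Add zero-mean Gaussian noise to each pixel and clamp to 0..255."""
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in frame
    ]

def drop_every_other_frame(frames):
    """Keep frames 0, 2, 4, ... so apparent motion between frames doubles."""
    return frames[::2]

def augment_clip(frames, masks, sigma=5.0, seed=0):
    """Return the original clip plus three augmented (frames, masks) pairs.

    sigma and seed are hypothetical parameters, not values from the paper.
    """
    rng = random.Random(seed)
    return [
        (frames, masks),
        # Mirroring must be applied to the annotation as well as the frame.
        ([mirror(f) for f in frames], [mirror(m) for m in masks]),
        # Noise perturbs the input only; the annotation stays unchanged.
        ([add_gaussian_noise(f, sigma, rng) for f in frames], masks),
        # Frame skipping must drop the same indices from both sequences.
        (drop_every_other_frame(frames), drop_every_other_frame(masks)),
    ]

# Toy clip: four 2x3 frames with matching binary masks.
frames = [[[10 * t, 20, 30], [40, 50, 60]] for t in range(4)]
masks = [[[0, 1, 1], [0, 0, 1]] for _ in range(4)]
clips = augment_clip(frames, masks)
```

Under these assumptions, one annotated clip yields four training clips. The frame-skipped clip has half as many frames, but it is still an ordered sequence, which is what a recurrent module such as ConvLSTM expects.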