MAS-SAM: Segment Any Marine Animal with Aggregated Features

Tianyu Yan, Zifu Wan, Xinhao Deng, Pingping Zhang, Yang Liu, Huchuan Lu

2024-05-09

Abstract: Recently, the Segment Anything Model (SAM) has shown exceptional performance in generating high-quality object masks and achieving zero-shot image segmentation. However, as a versatile vision model, SAM is primarily trained on large-scale natural-light images. In underwater scenes, it exhibits substantial performance degradation due to light scattering and absorption. Meanwhile, the simplicity of SAM's decoder might lead to the loss of fine-grained object details. To address these issues, we propose a novel feature learning framework named MAS-SAM for marine animal segmentation, which integrates effective adapters into SAM's encoder and constructs a pyramidal decoder. More specifically, we first build a new SAM encoder with effective adapters for underwater scenes. Then, we introduce a Hypermap Extraction Module (HEM) to generate multi-scale features for comprehensive guidance. Finally, we propose a Progressive Prediction Decoder (PPD) to aggregate the multi-scale features and predict the final segmentation results. When combined with the Fusion Attention Module (FAM), our method can extract richer marine information, from global contextual cues to fine-grained local details. Extensive experiments on four public MAS datasets demonstrate that MAS-SAM obtains better results than other typical segmentation methods. The source code is available at

Computer Vision and Pattern Recognition, Robotics

What problem does this paper attempt to address?

This paper addresses the accurate segmentation of marine animals in complex underwater environments. Although the Segment Anything Model (SAM) performs well on natural-light scenes, its performance drops significantly underwater, where light scattering and absorption degrade image quality, reduce contrast, and blur objects. In addition, SAM's relatively simple decoder may lose fine-grained object details. To address these challenges, the paper proposes MAS-SAM, a feature-learning framework tailored to marine animal segmentation. MAS-SAM improves SAM in three ways:

1. **Adapter-informed SAM Encoder (ASE)**: Effective adapters are integrated into SAM's encoder so that it can extract features specific to marine animal images.
2. **Hypermap Extraction Module (HEM)**: Multi-scale feature maps are generated to provide comprehensive guidance for the subsequent mask prediction.
3. **Progressive Prediction Decoder (PPD)**: Multi-source features from the original prompts, the ASE, and the HEM are aggregated progressively, which strengthens the decoder's representational ability and captures information ranging from global context to fine-grained local details.

With these improvements, MAS-SAM achieves better results than other typical segmentation methods on four public marine animal segmentation datasets.
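As a rough illustration of two ideas named in this summary, the sketch below shows a generic residual bottleneck adapter and a coarse-to-fine aggregation of multi-scale feature maps. It is not the paper's implementation: the class `BottleneckAdapter`, the functions `upsample2x` and `aggregate_coarse_to_fine`, the zero-initialised up-projection, and fusion by plain addition are all assumptions chosen for illustration (the paper itself fuses features with its Fusion Attention Module, whose details are not given here). Pure Python lists stand in for tensors.

```python
import random


def matvec(weights, vec):
    """Multiply a (rows x cols) weight matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in vec]


class BottleneckAdapter:
    """Hypothetical adapter: token + W_up * relu(W_down * token).

    Only these small matrices would be trained; the frozen encoder
    weights are not modelled in this sketch.
    """

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.w_down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(hidden)]
        # Assumption: zero-initialised up-projection, so the adapter
        # starts out as an identity mapping.
        self.w_up = [[0.0] * hidden for _ in range(dim)]

    def __call__(self, token):
        update = matvec(self.w_up, relu(matvec(self.w_down, token)))
        return [t + u for t, u in zip(token, update)]


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def aggregate_coarse_to_fine(pyramid):
    """Fuse maps ordered coarse to fine; each level is 2x the previous size.

    Assumption: fusion is element-wise addition, a stand-in for a
    learned fusion step.
    """
    fused = pyramid[0]
    for finer in pyramid[1:]:
        up = upsample2x(fused)
        fused = [[a + b for a, b in zip(r_up, r_fine)]
                 for r_up, r_fine in zip(up, finer)]
    return fused


adapter = BottleneckAdapter(dim=4, hidden=2)
token = [0.5, -1.0, 2.0, 0.0]
adapted = adapter(token)  # equals token while w_up is all zeros

pyramid = [[[1.0]], [[1.0, 2.0], [3.0, 4.0]]]
fused = aggregate_coarse_to_fine(pyramid)
```

Starting the adapter as an identity is a common choice when adapting a pretrained encoder, since the network's initial behaviour is then unchanged; whether MAS-SAM does this is not stated in the summary.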

Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM

An Image Segmentation Method Based on Transformer and Multi-Scale Feature Fusion for UAV Marine Environment Monitoring

SAM Fails to Segment Anything? – SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More

SAM-Adapter: Adapting Segment Anything in Underperformed Scenes

Segment Anything with Multiple Modalities

Marine Animal Segmentation

Adapting SAM for Underwater Object Segmentation

MASNet: A Robust Deep Marine Animal Segmentation Network

AquaSAM: Underwater Image Foreground Segmentation

SAMP: Adapting Segment Anything Model for Pose Estimation

MeSAM: Multiscale Enhanced Segment Anything Model for Optical Remote Sensing Images

WaterSAM: Adapting SAM for Underwater Object Segmentation

Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection

Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation

MSGNet: Multi-Source Guidance Network for Fish Segmentation in Underwater Videos

Semantic-SAM: Segment and Recognize Anything at Any Granularity

AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model

SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation

MW‐SAM: Mangrove wetland remote sensing image segmentation network based on segment anything model

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything