Underwater Fish Image Recognition Based on Detection Transformer

Zhuowei Wang, Zhukang Ruan, Chong Chen
DOI: https://doi.org/10.3390/jmse12060864
IF: 2.744
2024-05-23
Journal of Marine Science and Engineering
Abstract: Due to the complexity of underwater environments and the lack of training samples, target detection algorithms have yet to deliver satisfactory results in underwater settings. It is crucial to design specialized underwater target recognition algorithms for different underwater tasks. To this end, we created a dataset of freshwater fish captured from multiple angles and under varied lighting conditions, aiming to improve underwater target detection of freshwater fish in natural environments. We propose a method suitable for underwater target detection, called DyFish-DETR (Dynamic Fish Detection with Transformers). In DyFish-DETR, we propose DyFishNet (Dynamic Fish Net) to better extract fish body texture features, and we design a Slim Hybrid Encoder to fuse fish body feature information. The results of ablation experiments show that DyFishNet can effectively improve the mean Average Precision (mAP) of model detection, while the Slim Hybrid Encoder can effectively improve Frames Per Second (FPS). Both DyFishNet and the Slim Hybrid Encoder can reduce model parameters and Floating Point Operations (FLOPs). On our proposed freshwater fish dataset, DyFish-DETR achieved an mAP of 96.6%. The benchmarking results show that the Average Precision (AP) and Average Recall (AR) of DyFish-DETR are higher than those of several state-of-the-art methods. Additionally, DyFish-DETR achieved 99%, 98.8%, and 83.2% mAP, respectively, on other underwater datasets.
engineering, ocean, oceanography, marine
What problem does this paper attempt to address?
The paper addresses the challenges of underwater fish image recognition: the complexity of the underwater environment and the lack of training samples cause target detection algorithms to perform unsatisfactorily in underwater applications. The researchers created a freshwater fish image dataset, captured from multiple angles and under varied lighting conditions, to improve underwater target detection of freshwater fish in natural environments. They propose DyFish-DETR (Dynamic Fish Detection with Transformers), a fish detection method based on the detection transformer. DyFish-DETR includes DyFishNet, which extracts fish body texture features more effectively, and a Slim Hybrid Encoder, which fuses fish body feature information. Ablation experiments showed that DyFishNet improves the mean Average Precision (mAP) of detection, that the Slim Hybrid Encoder improves the frame rate (FPS), and that both components reduce model parameters and floating point operations (FLOPs). On the proposed freshwater fish dataset, DyFish-DETR achieved an mAP of 96.6%, and in benchmarking its Average Precision (AP) and Average Recall (AR) were higher than those of several state-of-the-art methods. It also achieved 99%, 98.8%, and 83.2% mAP, respectively, on other underwater datasets. The goal of the paper is to design an algorithm specifically for underwater target recognition, especially for the detection of freshwater fish, to address challenges such as similar fish features and body distortions.