Robust Fish Recognition Using Foundation Models toward Automatic Fish Resource Management

Tatsuhito Hasegawa, Daichi Nakano
DOI: https://doi.org/10.3390/jmse12030488
IF: 2.744
2024-03-15
Journal of Marine Science and Engineering
Abstract: Resource management for fisheries plays a pivotal role in fostering a sustainable fisheries industry. In Japan, resource surveys rely on manual measurements by staff, incurring high costs and limitations on the number of feasible measurements. This study endeavors to revolutionize resource surveys by implementing image-recognition technology. Our methodology involves developing a system that detects individual fish regions in images and automatically identifies crucial keypoints for accurate fish length measurements. We use grounded-segment-anything (Grounded-SAM), a foundation model for fish instance segmentation. Additionally, we employ a Mask Keypoint R-CNN trained on the fish image bank (FIB), which is an original dataset of fish images, to accurately detect significant fish keypoints. Diverse fish images were gathered for evaluation experiments, demonstrating the robust capabilities of the proposed method in accurately detecting both fish regions and keypoints.
oceanography, engineering, marine, ocean
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to achieve robust fish recognition using Foundation Models, thereby promoting the automation of fishery resource management. Specifically, the researchers attempt to address the following key issues:

1. **High Cost and Low Efficiency of Manual Measurement**:
   - Currently, in Japan, fishery resource surveys mainly rely on manual measurements by staff, which is not only time-consuming and costly but also limits the feasible number of measurements.
2. **Need for Automated Resource Surveys**:
   - To improve the accuracy and efficiency of resource surveys, it is crucial to introduce new technologies and methods. The researchers have developed a system that can automatically detect fish regions through image recognition technology and accurately identify key points for measuring fish length.
3. **Challenges of Multi-Species and Complex Background Recognition**:
   - In practical applications, the caught fish are often numerous and the background is complex, which poses high requirements for the robustness of the recognition system. The researchers ensure the system's effectiveness and accuracy in various environments by using foundation models and custom datasets.
4. **Error Detection and Handling**:
   - Although image recognition methods have high detection accuracy, achieving 100% detection remains a challenge. Therefore, the researchers introduced an error detection function to ensure that only reliable detection data is used for resource surveys.

### Solution Overview

The researchers propose a robust fish recognition method based on foundation models, which mainly includes the following steps:

1. **Instance Segmentation**:
   - Using the Grounded-SAM model to detect each fish region in the image.
2. **Keypoints Detection**:
   - Using the Mask Keypoint R-CNN model to detect key points of each fish, which are used for accurate fish length measurement.
3. **Species Identification**:
   - Although this part is not discussed in detail in the paper, it can be achieved through other techniques such as the authors' previous research, Fish-Pak, and WildFish methods.
4. **Length Calculation**:
   - Using the coordinates of key points and depth information, the fish length is calculated through Euclidean distance.
5. **Error Check**:
   - Introducing a model to analyze the detection results to ensure that only reliable data is used for resource surveys.

### Main Contributions

1. **Instance Segmentation Model**:
   - Optimized the Grounded-SAM model through prompt engineering, making it suitable for resource surveys in various environments.
2. **Keypoints Detection and Error Check Model**:
   - Proposed robust keypoints detection and error check models and demonstrated their effectiveness through validation.
3. **Fish Image Bank (FIB) Dataset**:
   - Created and released the FIB dataset, a new benchmark dataset containing 405 4K resolution fish images, each annotated with fish species, region masks, and key points.

Through these methods and contributions, the researchers hope to promote the widespread application of automated resource surveys, improving the efficiency and accuracy of fishery resource management.
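
The length-calculation step described above (keypoint coordinates plus depth, then a Euclidean distance) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera model, and the intrinsics `fx`, `fy`, `cx`, `cy`, the function names `back_project` and `fish_length`, and the choice of head and tail keypoints are all hypothetical, introduced here only for illustration.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def back_project(pixel: Point2D, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map a pixel (u, v) with known depth to camera coordinates.

    Assumes a pinhole camera (an assumption, not stated in the summary):
    fx, fy are focal lengths in pixels, (cx, cy) is the principal point.
    """
    u, v = pixel
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def fish_length(head_px: Point2D, tail_px: Point2D,
                head_depth: float, tail_depth: float,
                fx: float, fy: float, cx: float, cy: float) -> float:
    """Euclidean distance between two back-projected keypoints.

    The result is in the same unit as the depth values (e.g. metres).
    """
    head = back_project(head_px, head_depth, fx, fy, cx, cy)
    tail = back_project(tail_px, tail_depth, fx, fy, cx, cy)
    return math.dist(head, tail)


if __name__ == "__main__":
    # Hypothetical intrinsics and keypoints, chosen only for illustration.
    length = fish_length(
        head_px=(660.0, 540.0), tail_px=(1260.0, 540.0),
        head_depth=1.0, tail_depth=1.0,
        fx=1000.0, fy=1000.0, cx=960.0, cy=540.0,
    )
    print(f"estimated length: {length:.3f} m")
```

Using per-keypoint depth rather than a single average depth means a fish that is tilted relative to the image plane is not systematically underestimated, at the cost of being more sensitive to depth noise at the two keypoints.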