Cross-modal Feature Fusion Mask R-CNN and Point Cloud Normalization Segmentation Transformation for Fish Length Estimation

Haoran Li, Xin Ma, Hanchi Liu
DOI: https://doi.org/10.1007/s10499-024-01610-4
2024-01-01
Aquaculture International
Abstract: Automatic fish length estimation is essential for modern aquaculture. Occlusion and body bending make accurate fish length estimation challenging in intensive aquaculture environments. To address these issues, this study proposes a fish length estimation scheme based on cross-modal feature fusion Mask R-CNN (CMFF Mask R-CNN) and point cloud normalization segmentation transformation. To eliminate fish that appear incomplete in the binocular images due to occlusion, and to extract masks of fish that appear complete, a cross-modal feature fusion module is designed and embedded into Mask R-CNN to aggregate fish boundary features from RGB and disparity into unified feature maps. These feature maps help remove incomplete fish and improve the boundary accuracy of complete fish masks. A fish length estimation algorithm based on point cloud normalization segmentation transformation is designed to reduce the length estimation error caused by bending. After the plane and ellipse fitting transformation, the fish contour point cloud is transformed into a unified space for K-means clustering segmentation. The fish length is the sum of the segment lengths. Experimental results show that the mean relative error of the salmon length estimation is less than 5
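The idea of estimating the length of a bent body by summing segment lengths can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the input is a set of 3D points sampled along the fish body, replaces the ellipse fitting normalization with a plain PCA plane projection, and uses a small hand-written K-means. The function names `fit_plane_project`, `kmeans`, and `estimate_length` are hypothetical.

```python
import numpy as np


def fit_plane_project(points):
    """Project 3D points onto their least-squares plane and return 2D coordinates.

    The first returned coordinate runs along the main body axis (first principal axis).
    """
    centered = points - points.mean(axis=0)
    # The first two right singular vectors span the best-fit plane.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def kmeans(points2d, k, iters=50):
    """Plain Lloyd's K-means, initialised from k equal chunks along the body axis."""
    order = np.argsort(points2d[:, 0])
    centers = np.array([points2d[c].mean(axis=0) for c in np.array_split(order, k)])
    for _ in range(iters):
        dists = np.linalg.norm(points2d[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points2d[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


def estimate_length(points, k=8):
    """Approximate body length as a polyline: head -> cluster centers -> tail."""
    p2 = fit_plane_project(points)
    centers, _ = kmeans(p2, k)
    centers = centers[np.argsort(centers[:, 0])]  # order segments along the body
    head = p2[p2[:, 0].argmin()]
    tail = p2[p2[:, 0].argmax()]
    path = np.vstack([head, centers, tail])
    return float(np.linalg.norm(np.diff(path, axis=0), axis=1).sum())


if __name__ == "__main__":
    # Synthetic bent "fish": a circular arc of radius 1 spanning 1.2 rad (true length 1.2).
    t = np.linspace(-0.6, 0.6, 400)
    arc = np.column_stack([np.sin(t), np.cos(t), np.zeros_like(t)])
    print("segment-sum length:", estimate_length(arc, k=8))
    print("straight head-to-tail distance:", np.linalg.norm(arc[-1] - arc[0]))
```

On a bent body the straight head-to-tail distance underestimates the length, while the sum over short segments follows the curve; this is the motivation the abstract gives for segmenting before measuring.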