Fish Detection and Classification Based on Improved ViT

Daxiong Ji, Ahmad Faraz Hussain, Sheharyar Hussain, Somadina Godwin Ogbonnaya, Songming Zhu, Xinwei Wang
DOI: https://doi.org/10.1109/icarce59252.2024.10492544
2023-01-01
Abstract: It is crucial for marine researchers and conservationists to routinely evaluate the relative abundance of fish species in their environments and to monitor changes in their populations. Various automated, computer-based systems for sampling fish in underwater videos have been proposed as alternatives to laborious manual sampling. However, no ideal solution exists for automated fish identification and species classification. This is mainly because of the challenges of filming underwater, such as varying light levels, fish concealment, changing and intricate backgrounds, murky water, limited resolution, fish shape distortions, occlusion by other bodies in the water, and subtle differences among specific fish species. The proposed method uses an improved vision transformer (IMViT) for detection and classification. To capture inductive-bias information from images, the convolutional and residual units of ResNet are incorporated into the traditional ViT model, improving its architecture for small-scale datasets. Our results show that deep learning (DL) models can detect and classify species from underwater fish images and achieve strong results. The “Fish-Dataset” is used for evaluation. The method achieves significantly higher discrimination accuracy than previously proposed techniques. In light of the experimental findings, the proposed model may also be applicable to other underwater objects.