Multiscale Kiwifruit Detection from Digital Images.

Yi Xia, Minh Nguyen, Raymond Lutui, Wei Qi Yan
DOI: https://doi.org/10.1007/978-981-97-0376-0_7
2024-01-01
Abstract: In this paper, we propose an improved YOLOv8-based Kiwifruit detection method using Swin Transformer, aiming to address the challenges posed by significant scale variation and the resulting inaccuracies in multiscale object detection. Specifically, our approach embeds the encoder from Swin Transformer, which is built on a shifted-window design, into the YOLOv8 architecture to capture contextual information and global dependencies of the detected objects at multiple scales, facilitating the learning of semantic features. In comparative experiments against state-of-the-art object detection algorithms on our collected dataset, the proposed method detects objects at different scales efficiently, significantly reducing false negatives while improving precision. Moreover, the method proves versatile in detecting objects of various sizes in different environmental settings, fulfilling the real-time requirements of complex and unknown Kiwifruit cultivation scenarios. The results highlight the potential practical applications of the proposed approach in the Kiwifruit industry, showcasing its suitability for addressing real-world challenges and complexities.
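The window mechanism that the abstract attributes to the Swin Transformer encoder can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy 8x8 feature map, and the window size of 4 are assumptions chosen for illustration. It only shows how a feature map is split into local windows and how a half-window cyclic shift makes the next layer's windows straddle the previous boundaries. The attention computation itself, and the masking Swin applies to windows that wrap around the border, are omitted.

```python
import numpy as np


def window_partition(feature_map: np.ndarray, window_size: int) -> np.ndarray:
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C),
    i.e. one token sequence per window, as used for per-window attention.
    """
    height, width, channels = feature_map.shape
    if height % window_size or width % window_size:
        raise ValueError("H and W must be divisible by window_size")
    grid = feature_map.reshape(
        height // window_size, window_size,
        width // window_size, window_size,
        channels,
    )
    # Bring the two window-grid axes together, then flatten each window.
    windows = grid.transpose(0, 2, 1, 3, 4)
    return windows.reshape(-1, window_size * window_size, channels)


def shifted_window_partition(feature_map: np.ndarray, window_size: int) -> np.ndarray:
    """Cyclically shift the map by half a window, then partition it."""
    shift = window_size // 2
    shifted = np.roll(feature_map, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Hypothetical toy input: an 8x8 single-channel map holding flat pixel indices.
toy_map = np.arange(8 * 8).reshape(8, 8, 1)
regular = window_partition(toy_map, window_size=4)
shifted = shifted_window_partition(toy_map, window_size=4)
```

In this toy case the first regular window covers rows 0 to 3 and columns 0 to 3, while the first shifted window covers rows 2 to 5 and columns 2 to 5, so it overlaps all four regular windows. Alternating the two partitions is what lets information flow between neighbouring windows across successive layers.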