Fruit ripeness identification using transformers

Bingjie Xiao, Minh Nguyen, Wei Qi Yan
DOI: https://doi.org/10.1007/s10489-023-04799-8
IF: 5.3
2023-06-30
Applied Intelligence
Abstract: Pattern classification has always been essential in computer vision. The Transformer paradigm, whose attention mechanism provides a global receptive field, improves the efficiency and effectiveness of visual object detection and recognition. The primary purpose of this article is to classify the ripeness of various types of fruit accurately. We create fruit datasets to train, test, and evaluate multiple Transformer models. Transformers are fundamentally composed of encoding and decoding procedures. The encoder is built by stacking blocks, much like convolutional neural networks (CNNs or ConvNets). Vision Transformer (ViT), Swin Transformer, and multilayer perceptron (MLP) models are considered in this paper. We examine the advantages of these three models for accurately analyzing fruit ripeness. We find that Swin Transformer achieves better results than ViT for both pears and apples from our dataset.
computer science, artificial intelligence