Recognition and calculation of objects in images using YOLOv3 architecture

Hrabovskyi V, Kmet O
DOI: https://doi.org/10.15407/jai2021.02.042
IF: 14.4
2021-12-01
Artificial Intelligence
Abstract: A program is presented that searches images of fruit trees for five types of fruit, classifies them, and counts them. It was designed to work both in the background and in real time, and to identify the target objects at a sufficiently high speed. The program also had to be trainable on commonly available computers (including laptops) within a reasonable time. To this end, several existing approaches to recognizing and identifying visual objects with convolutional neural networks were analyzed. The network architectures considered included R-CNN, Fast R-CNN, Faster R-CNN, SSD, YOLO, and some modifications based on them. Based on an analysis of how they operate, the YOLO architecture was chosen because it allows visual objects to be analyzed in real time with high speed and reliability. The software was implemented by modifying a YOLOv3 architecture implemented in TensorFlow 2.1. Object recognition in this architecture is performed by a pretrained Darknet-53 network whose parameters are freely available. The modification consisted of replacing the network's original classification layer. The modified network was trained by transfer learning on the Agrilfruit Dataset. The training process was also studied under different types of gradient descent (stochastic, and mini-batch with batch sizes of 4 and 8), and the best version of the trained network weights was selected for further use. Tests of the modified and trained network showed that a system based on it reliably distinguishes objects of the corresponding classes and of different sizes in an image (even when they are substantially obscured) and counts them.
The program's ability to distinguish and count individual fruits in an analyzed image can be used to visually assess the yield of fruit trees.
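The counting step described in the abstract can be illustrated with a minimal sketch. It assumes the detector has already produced a list of (class index, confidence) pairs for one image. The class names, the 0.5 threshold, and the function name `count_fruits` are hypothetical and are not taken from the paper, which does not list the five fruit types here.

```python
from collections import Counter

# Hypothetical class names: the abstract mentions five fruit types
# but does not name them.
CLASS_NAMES = ["apple", "pear", "plum", "cherry", "peach"]


def count_fruits(detections, score_threshold=0.5):
    """Count detections per class, keeping those at or above the threshold.

    `detections` is an iterable of (class_id, score) pairs, for example
    the class index and confidence of each box that a detector returns
    for one image.
    """
    counts = Counter()
    for class_id, score in detections:
        if score >= score_threshold:
            counts[CLASS_NAMES[class_id]] += 1
    return dict(counts)


# Made-up detections for one image, for illustration only.
example = [(0, 0.91), (0, 0.42), (2, 0.77), (0, 0.66), (4, 0.55)]
print(count_fruits(example))
```

With these made-up detections, the low-confidence apple (0.42) is dropped, so the per-class totals are two apples, one plum, and one peach.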
computer science, artificial intelligence