Rethinking the Crop Row Detection Pipeline: An End-to-End Method for Crop Row Detection Based on Row-Column Attention

Boliao Li, Dongfang Li, Zhenbo Wei, Jun Wang
DOI: https://doi.org/10.1016/j.compag.2024.109264
IF: 8.3
2024-01-01
Computers and Electronics in Agriculture
Abstract: Vision-based autonomous navigation technology is vitally important for the unmanned driving and precise operation of agricultural machinery. Crop row detection, a fundamental task of vision-based navigation, has a significant impact on automatic navigation. The current pipeline for crop row detection is cumbersome and requires multiple steps with manually adjusted parameters, which limits the accuracy, speed, and robustness of crop row detection. This study proposes an end-to-end deep neural network based on row-column attention to simplify the crop row detection pipeline. The network directly outputs the coordinate representation of the crop rows in the image, without additional complex post-processing or manual parameter adjustment. In the model design, a crop row is defined as a collection of points distributed in the image. A row-column attention mechanism, used in the transformer-based encoder-decoder network, is proposed to capture row and column features, which helps generate more detailed crop rows. The LineIoU loss and Line Location loss are designed to make the network focus on the position and shape information of the crop rows, which can improve the response speed, precision, and stability of the proposed network. Experiments on tea and vegetable datasets showed that the model achieved 95.75 % accuracy and that the average lateral distance between prediction and ground truth was 8.8 pixels. Compared with the existing SOTA model, the accuracy of the model is improved by 8.5 %, and the average running time for processing a 1920 × 1080 image is 23.81 ms.
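To make the point-based row representation and the two reported quantities more concrete, here is a minimal Python sketch. The abstract does not give the exact definition of the LineIoU loss, so this sketch assumes the formulation commonly used in lane detection: each predicted point is widened into a horizontal segment of fixed half-width, and the summed overlap is divided by the summed union. The function names, the `radius` value, and the assumption that both rows are sampled at the same image rows are illustrative and not taken from the paper.

```python
from typing import Sequence


def mean_lateral_distance(pred_x: Sequence[float], gt_x: Sequence[float]) -> float:
    """Average absolute horizontal offset (in pixels) between two crop rows.

    Both rows are assumed to be sampled at the same image rows, so only the
    x-coordinates are needed.
    """
    if len(pred_x) != len(gt_x) or len(pred_x) == 0:
        raise ValueError("rows must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred_x, gt_x)) / len(pred_x)


def line_iou(pred_x: Sequence[float], gt_x: Sequence[float], radius: float = 15.0) -> float:
    """Line IoU under the assumed lane-detection-style definition.

    Each point is extended to the segment [x - radius, x + radius]. The
    per-point overlap may be negative when the segments do not touch, which
    keeps the measure informative for distant rows.
    """
    if len(pred_x) != len(gt_x) or len(pred_x) == 0:
        raise ValueError("rows must be non-empty and of equal length")
    overlap = 0.0
    union = 0.0
    for p, g in zip(pred_x, gt_x):
        overlap += min(p + radius, g + radius) - max(p - radius, g - radius)
        union += max(p + radius, g + radius) - min(p - radius, g - radius)
    return overlap / union


def line_iou_loss(pred_x: Sequence[float], gt_x: Sequence[float], radius: float = 15.0) -> float:
    """Loss that decreases as the predicted row aligns with the ground truth."""
    return 1.0 - line_iou(pred_x, gt_x, radius)


if __name__ == "__main__":
    gt = [100.0, 110.0, 120.0]    # hypothetical ground-truth x-coordinates
    pred = [110.0, 120.0, 130.0]  # hypothetical prediction, shifted 10 px right
    print(mean_lateral_distance(pred, gt))
    print(line_iou_loss(pred, gt))
```

Because the whole row contributes to a single ratio, a loss of this kind penalizes the row as a unit rather than each point independently, which is one plausible reading of the abstract's remark that the loss focuses on the position and shape of the crop rows.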