TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models

Marija Ivanovska, Vitomir Struc, Janez Pers
2023-07-03
Abstract: Artificial intelligence applications enable farmers to optimize crop growth and production while reducing costs and environmental impact. Computer vision-based algorithms, in particular, are commonly used for fruit segmentation, enabling in-depth analysis of the harvest quality and accurate yield estimation. In this paper, we propose TomatoDIFF, a novel diffusion-based model for semantic segmentation of on-plant tomatoes. When evaluated against other competitive methods, our model demonstrates state-of-the-art (SOTA) performance, even in challenging environments with highly occluded fruits. Additionally, we introduce Tomatopia, a new, large and challenging dataset of greenhouse tomatoes. The dataset comprises high-resolution RGB-D images and pixel-level annotations of the fruits.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Propose a new model**: The paper introduces TomatoDIFF, a new method based on denoising diffusion models, for semantic segmentation of greenhouse tomatoes. Compared to the other methods it is evaluated against, the model segments highly occluded tomatoes in complex environments more accurately.
2. **Introduce a new dataset**: To validate the model, the researchers also created Tomatopia, a new large-scale dataset of greenhouse tomatoes. It includes high-resolution RGB-D images and pixel-level annotations, and is intended to address the shortcomings of currently available public datasets and to improve the comparability and reproducibility of results.
3. **Improve detection accuracy**: Existing deep learning methods struggle to detect highly occluded or distant targets. TomatoDIFF improves segmentation accuracy by combining RGB images, noisy ground truth masks, and pre-trained feature maps, and it is reported to perform particularly well on tomatoes in densely planted environments.
4. **Evaluation and comparison**: The paper evaluates TomatoDIFF comprehensively and compares it with other state-of-the-art models (such as Mask R-CNN and YOLACT), demonstrating superior performance on two different datasets.
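
The idea of training on noisy ground truth masks alongside the RGB image can be illustrated with the standard forward (noising) step of a denoising diffusion model. The following is a minimal sketch under stated assumptions, not the authors' implementation: the linear noise schedule, the rescaling of the mask to [-1, 1], and the per-pixel channel stacking are common diffusion-segmentation conventions assumed here, all function names are hypothetical, and the pre-trained feature maps are left out.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, DDPM-style schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_mask(mask, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.

    mask is an H x W nested list of 0/1 labels; it is rescaled to [-1, 1]
    before noising (an assumption). Returns the noisy mask and the noise
    eps that a denoising network would be trained to predict.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noisy, eps = [], []
    for row in mask:
        eps_row = [rng.gauss(0.0, 1.0) for _ in row]
        noisy.append([signal * (2.0 * m - 1.0) + sigma * e
                      for m, e in zip(row, eps_row)])
        eps.append(eps_row)
    return noisy, eps


def stack_input(rgb, noisy_mask):
    """Append the noisy mask as a fourth channel to every RGB pixel."""
    return [[(*pixel, value) for pixel, value in zip(rgb_row, mask_row)]
            for rgb_row, mask_row in zip(rgb, noisy_mask)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    mask = [[0, 1], [1, 0]]                      # toy 2 x 2 tomato mask
    rgb = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
           [(0.7, 0.8, 0.9), (0.0, 0.0, 0.0)]]   # toy 2 x 2 RGB image
    noisy, eps = noise_mask(mask, 500, alpha_bars, rng)
    network_input = stack_input(rgb, noisy)
    print(len(network_input), len(network_input[0]), len(network_input[0][0]))
```

At small `t` the noisy mask stays close to the clean annotation, while at large `t` it approaches pure Gaussian noise. A denoising network conditioned on the image would learn to undo this corruption, so that at inference time a mask can be recovered starting from noise.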