4D Metric-Semantic Mapping for Persistent Orchard Monitoring: Method and Dataset

Jiuzhou Lei, Ankit Prabhu, Xu Liu, Fernando Cladera, Mehrad Mortazavi, Reza Ehsani, Pratik Chaudhari, Vijay Kumar
2024-09-30
Abstract: Automated persistent and fine-grained monitoring of orchards at the individual tree or fruit level helps maximize crop yield and optimize resources such as water, fertilizers, and pesticides while preventing agricultural waste. Towards this goal, we present a 4D spatio-temporal metric-semantic mapping method that fuses data from multiple sensors, including LiDAR, RGB camera, and IMU, to monitor the fruits in an orchard across their growth season. A LiDAR-RGB fusion module is designed for 3D fruit tracking and localization, which first segments fruits using a deep neural network and then tracks them using the Hungarian Assignment algorithm. Additionally, the 4D data association module aligns data from different growth stages into a common reference frame and tracks fruits spatio-temporally, providing information such as fruit counts, sizes, and positions. We demonstrate our method's accuracy in 4D metric-semantic mapping using data collected from a real orchard under natural, uncontrolled conditions with seasonal variations. We achieve a 3.1 percent error in total fruit count estimation for over 1790 fruits across 60 apple trees, along with accurate size estimation results with a mean error of 1.1 cm. The datasets, consisting of LiDAR, RGB, and IMU data of five fruit species captured across their growth seasons, along with corresponding ground truth data, will be made publicly available at: https://4d-metric-semantic-mapping.org/
Robotics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses continuous, fine-grained monitoring of fruit growth in orchards. Automating monitoring at the level of individual trees or fruits helps maximize crop yield, optimize the use of resources such as water, fertilizers, and pesticides, and prevent agricultural waste. To this end, the authors propose a 4D spatiotemporal metric-semantic mapping method that integrates data from multiple sensors (LiDAR, RGB camera, and IMU) to monitor fruits in the orchard throughout the growing season.

### Main Contributions

1. **Method**: A complete framework that processes LiDAR-visual-inertial measurements across multiple time sessions and constructs a 4D metric-semantic map while estimating the number and size of fruits. The framework includes spatial and temporal data association modules that track fruits and integrate information from multiple sensors and time sessions.
2. **Experiments**: Experiments in a real orchard under natural conditions demonstrate the framework's ability to track and monitor fruit growth over time.
3. **Dataset**: A publicly released dataset of multimodal measurements covering multiple growth stages, together with instance segmentation annotations and ground truth fruit counts and sizes needed for pre-trained models and benchmarking.

### Method Overview

1. **Spatial (3D) Tracking**:
   - **Fruit Instance Point Cloud Extraction**: Fruit instance point clouds are obtained by fusing LiDAR and RGB data. The images are first preprocessed, a YOLOv8 model performs instance segmentation, and the Faster-LIO algorithm estimates the sensor pose. The point cloud is then projected onto the image plane to obtain the instance point cloud of each fruit.
   - **Cross-Image Frame Tracking**: The Hungarian assignment algorithm matches fruits between consecutive frames. The matching cost is the mask Intersection over Union (IoU), computed taking into account the depth of each fruit in the camera coordinate system and the corresponding camera pose.
   - **Reprojection Error Minimization**: Fruit positions are refined by minimizing their reprojection error across all image frames, which improves the depth estimates.
2. **Spatiotemporal (4D) Tracking**:
   - **Data Alignment**: The ICP algorithm aligns point clouds from different time sessions, transforming them into a common reference frame.
   - **Fruit Association**: The Hungarian assignment algorithm associates fruits across multiple time sessions using a 3D Euclidean distance cost function.

### Experimental Results

1. **3D Fruit Tracking and Counting**: On 60 apple trees, the algorithm estimated 1846 apples against a ground truth of 1790, an absolute total count error of 56 apples (3.1%). The error mainly stemmed from noisy point clouds for some fruits and from fallen fruits on the ground being counted.
2. **4D Metric-Semantic Mapping and Fruit Size Estimation**: The precision, recall, and F1 score of 4D data association were evaluated by manually matching fruits on five trees. The results showed precisions of 75.96%, 88.89%, and 75.21%, recalls of 87.78%, 85.95%, and 84.25%, and F1 scores of 81.44%, 87.39%, and 79.47%. For size estimation, the abstract reports a mean error of 1.1 cm.

### Conclusion

The proposed method can accurately monitor fruit growth in orchards under natural conditions, providing farmers with timely and detailed information to optimize agricultural production. The released dataset is also a useful resource for related research.
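The fruit instance point cloud extraction step described in the Method Overview (project the LiDAR point cloud onto the image plane and keep the points that land on a fruit's segmentation mask) can be illustrated with a minimal sketch. It assumes an undistorted pinhole camera, and the function name `extract_instance_points` and the parameters `T_cam_world` and `K` are hypothetical names chosen for this example. In the paper's pipeline the pose would come from Faster-LIO and the mask from YOLOv8; here both are simply inputs.

```python
import numpy as np


def extract_instance_points(points_world, T_cam_world, K, mask):
    """Return the LiDAR points (in the camera frame) that project onto a mask.

    points_world: (N, 3) points in the odometry/world frame.
    T_cam_world:  (4, 4) transform from world to camera frame (assumed given).
    K:            (3, 3) pinhole intrinsics (assumed, no lens distortion).
    mask:         (H, W) boolean instance mask for a single fruit.
    """
    # Move the points into the camera frame using homogeneous coordinates.
    homog = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ homog.T).T[:, :3]

    # Discard points behind the camera before dividing by depth.
    pts_cam = pts_cam[pts_cam[:, 2] > 0]

    # Pinhole projection to integer pixel coordinates.
    uvw = (K @ pts_cam.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only projections that are inside the image and on the mask.
    h, w = mask.shape
    in_image = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    on_mask = np.zeros(len(pts_cam), dtype=bool)
    on_mask[in_image] = mask[v[in_image], u[in_image]]
    return pts_cam[on_mask]
```

A real system would also need to handle occlusion, since background points behind a fruit project onto the same pixels; this sketch does not attempt that.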
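The cross-frame tracking step (Hungarian assignment with a mask IoU cost) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the previous-frame masks have already been brought into the current image, whereas the summary says fruit depth and camera pose enter the cost without spelling out how. The helper names and the `min_iou` threshold are hypothetical. `scipy.optimize.linear_sum_assignment` solves the assignment problem.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mask_iou(a, b):
    """Intersection over Union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0


def match_masks(prev_masks, curr_masks, min_iou=0.3):
    """Match fruits between two frames; returns (prev_idx, curr_idx) pairs.

    min_iou is a hypothetical gate: assigned pairs with lower overlap are
    dropped, so those fruits are treated as lost or newly seen.
    """
    if not prev_masks or not curr_masks:
        return []
    iou = np.array([[mask_iou(p, c) for c in curr_masks] for p in prev_masks])
    # The solver minimizes total cost, so use 1 - IoU as the cost.
    rows, cols = linear_sum_assignment(1.0 - iou)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if iou[r, c] >= min_iou]
```

Gating after the assignment matters because the solver always pairs as many rows and columns as it can, even when the best available overlap is zero.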
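The spatiotemporal (4D) association step can be sketched in the same spirit: fruit positions from a later session are transformed into the reference session's frame, then matched with the Hungarian algorithm on 3D Euclidean distance. The sketch assumes the rigid transform `T_a_b` has already been estimated (by ICP in the paper; it is not computed here). The function name and the `max_dist` gate of 0.10 m are hypothetical choices for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def associate_sessions(fruits_a, fruits_b, T_a_b, max_dist=0.10):
    """Associate fruit centroids across two sessions.

    fruits_a: (N, 3) centroids in session A's frame (the common reference).
    fruits_b: (M, 3) centroids in session B's frame.
    T_a_b:    (4, 4) rigid transform mapping session B into session A,
              assumed to come from a prior point cloud alignment.
    max_dist: hypothetical distance gate in meters.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    # Bring session B centroids into the common reference frame.
    fruits_b_in_a = fruits_b @ T_a_b[:3, :3].T + T_a_b[:3, 3]

    # Pairwise 3D Euclidean distances form the assignment cost matrix.
    diff = fruits_a[:, None, :] - fruits_b_in_a[None, :, :]
    cost = np.linalg.norm(diff, axis=2)

    rows, cols = linear_sum_assignment(cost)
    # Reject pairs that are too far apart to be the same fruit.
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```

Fruits left unmatched after gating would correspond to fruits that fell, were picked, or were missed in one of the sessions.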
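The reported figures can be checked with a few lines of arithmetic. The sketch below recomputes the count error from the stated totals (1846 estimated, 1790 ground truth) and recomputes F1 as the harmonic mean of the first-listed percentage and recall for each of the three reported cases. The recomputed F1 values agree with the reported ones to within rounding, which is consistent with the first-listed percentages being precision values. The function names are illustrative.

```python
def count_error(estimated, ground_truth):
    """Absolute and percentage error of a total count."""
    abs_err = abs(estimated - ground_truth)
    return abs_err, 100.0 * abs_err / ground_truth


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    return 2.0 * precision * recall / (precision + recall)


abs_err, pct_err = count_error(1846, 1790)
print(abs_err, round(pct_err, 1))  # prints: 56 3.1

# (precision, recall, reported F1) for the three reported cases, in percent.
reported = [(75.96, 87.78, 81.44), (88.89, 85.95, 87.39), (75.21, 84.25, 79.47)]
for p, r, f1 in reported:
    # Small differences are expected because the inputs are already rounded.
    print(f"recomputed {f1_score(p, r):.2f} vs reported {f1:.2f}")
```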