Abstract: Estimating and counting tree density from a single aerial or satellite image is a difficult task in photogrammetry and remote sensing, yet it plays a crucial role in forest management. The huge variety of trees across varied topography severely hinders tree counting models from performing well. This paper proposes a framework that is learnt from a source domain with sufficient labeled trees and adapted to a target domain with only a limited number of labeled trees. Our method, termed AdaTreeFormer, contains one shared encoder with a hierarchical feature extraction scheme to extract robust features from the source and target domains. It also consists of three subnets: two for extracting self-domain attention maps from the source and target domains respectively, and one for extracting cross-domain attention maps. For the latter, an attention-to-adapt mechanism is introduced to distill relevant information from different domains while generating tree density maps; a hierarchical cross-domain feature alignment scheme is proposed that progressively aligns the features from the source and target domains. We also adopt adversarial learning into the framework to further reduce the gap between source and target domains. Our AdaTreeFormer is evaluated on six designed domain adaptation tasks using three tree counting datasets, i.e., Jiangsu, Yosemite, and London. Experimental results show that AdaTreeFormer significantly surpasses the state of the art; e.g., in the cross-domain task from the Yosemite to the Jiangsu dataset, it achieves a reduction of 15.9 points in absolute counting error and an increase of 10.8% in the accuracy of the detected trees' locations. The code and datasets are available at https://github.com/HAAClassic/AdaTreeFormer.
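
The abstract refers to tree density maps and to absolute counting error without spelling out how a count is read from a map. The sketch below illustrates one common convention in density-based counting, which is an assumption here and not necessarily the authors' procedure: each annotated tree location contributes a normalized Gaussian blob, so the map sums to the tree count, and the counting error is the mean absolute difference between predicted and true counts. The function names, the `sigma` value, and the sample coordinates are hypothetical.

```python
import math

def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map with one normalized Gaussian per annotated tree.

    Assumption: kernels are renormalized inside the image bounds so that
    every tree contributes exactly 1 to the sum of the map.
    """
    density = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        # Unnormalized Gaussian weights over the whole image.
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / total
    return density

def count_from_density(density):
    """The estimated tree count is the sum over all pixels of the map."""
    return sum(map(sum, density))

def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    pairs = zip(predicted_counts, true_counts)
    return sum(abs(p - t) for p, t in pairs) / len(true_counts)

# Hypothetical annotations given as (row, column) pixel coordinates.
trees = [(3, 4), (10, 12), (5, 5)]
density = gaussian_density_map(trees, height=16, width=16)
estimated_count = count_from_density(density)
```

With this convention, a model that regresses the density map can be scored directly on counts, and the peaks of the map indicate candidate tree locations.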

What problem does this paper attempt to address?

The paper aims to address the problem of tree density estimation and counting from a single high-resolution aerial or satellite image, especially in cases where there are significant variations in different geographical environments and tree species. The paper proposes a new framework called AdaTreeFormer, which is a few-shot domain adaptation method for the tree counting task. Specifically, the paper addresses the following key issues:

1. **Cross-domain Adaptability**: In different geographical environments (such as urban, rural, farmland, etc.) and different image types (such as aerial images or satellite images), existing tree counting models often struggle to adapt well to new environments because they require a large amount of annotated data for training. Therefore, the researchers propose a method that can learn from a source domain with sufficient annotated data and adapt to a target domain (where only a small amount of annotated data is available).
2. **Adaptation Using Attention Mechanism**: To improve the model's performance in the target domain, the paper proposes an "Attention-to-Adapt" mechanism, which extracts relevant features through self-attention and cross-domain attention, helping the model better understand the differences between domains.
3. **Hierarchical Cross-domain Feature Alignment**: To further ensure that features from the source and target domains can be effectively aligned, the researchers introduce a hierarchical cross-domain feature alignment scheme, gradually aligning features from both domains.
4. **Adversarial Training**: By adopting an adversarial training strategy, the method in the paper can further reduce the gap between the source and target domains, thereby generating more consistent tree density maps.

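
To make the distinction between self-attention and cross-domain attention concrete, here is a minimal sketch using generic scaled dot-product attention. It is not the AdaTreeFormer implementation: the summary does not specify how queries, keys, and values are assigned, so the choice below (queries from the target domain, keys and values from the source domain) and the toy token values are assumptions made purely for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on lists of feature vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs

# Toy feature tokens (hypothetical values, 2 channels each).
source_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_tokens = [[0.5, 0.5], [2.0, 0.0]]

# Self-domain attention: the target attends to itself.
self_attended = attention(target_tokens, target_tokens, target_tokens)

# Cross-domain attention (assumed direction): target queries attend to
# source keys and values, so target features are rebuilt from source ones.
cross_attended = attention(target_tokens, source_tokens, source_tokens)
```

Because every output row is a convex combination of the value vectors, the cross-domain result always lies within the range spanned by the source features, which is one intuitive way to see how information from one domain can be injected into the other.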
In summary, the goal of AdaTreeFormer is to achieve effective domain adaptation for the tree counting task with only a small amount of annotated data in the target domain, thereby improving the accuracy and generalization ability of tree counting. The method has been evaluated on six designed domain adaptation tasks built from three tree counting datasets (Jiangsu, Yosemite, and London) and has shown significantly better results than existing techniques.
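
The feature alignment and adversarial items listed above can be illustrated with two simple loss terms. The summary does not state which losses the paper uses, so both functions below are stand-ins chosen for clarity: alignment is sketched as matching per-level mean features between domains, and the adversarial part as the binary cross-entropy of a domain discriminator. All names and numbers are hypothetical.

```python
import math

def mean_vector(features):
    """Channel-wise mean of a list of feature vectors."""
    n = len(features)
    return [sum(f[c] for f in features) / n for c in range(len(features[0]))]

def alignment_loss(source_levels, target_levels):
    """Sum, over encoder levels, of the squared distance between the
    source and target mean features (a simple stand-in for alignment)."""
    loss = 0.0
    for src, tgt in zip(source_levels, target_levels):
        ms, mt = mean_vector(src), mean_vector(tgt)
        loss += sum((a - b) ** 2 for a, b in zip(ms, mt))
    return loss

def domain_bce(prob_source, is_source):
    """Binary cross-entropy for a discriminator that outputs P(source)."""
    eps = 1e-12
    p = min(max(prob_source, eps), 1.0 - eps)
    return -math.log(p) if is_source else -math.log(1.0 - p)

# One hypothetical encoder level with two 2-channel feature vectors each.
source_levels = [[[0.0, 0.0], [2.0, 2.0]]]
target_levels = [[[1.0, 1.0], [3.0, 3.0]]]

align = alignment_loss(source_levels, target_levels)
confused = domain_bce(0.5, is_source=True)
```

In an adversarial setup of this kind, the discriminator is trained to lower `domain_bce`, while the feature extractor is trained to make the two domains indistinguishable, pushing the discriminator's output toward 0.5.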

AdaTreeFormer: Few Shot Domain Adaptation for Tree Counting from a Single High-Resolution Image

TreeFormer: a Semi-Supervised Transformer-based Framework for Tree Counting from a Single High Resolution Image

Background-Aware Domain Adaptation for Plant Counting

Tree Counting by Bridging 3D Point Clouds with Imagery

ShadowSense: Unsupervised Domain Adaptation and Feature Fusion for Shadow-Agnostic Tree Crown Detection from RGB-Thermal Drone Imagery

Unsupervised Domain Adaptation For Plant Organ Counting

Cross-regional oil palm tree counting and detection via a multi-level attention domain adaptation network

Transformer for Tree Counting in Aerial Images

Joint Distribution Adaptive-Alignment for Cross-Domain Segmentation of High-Resolution Remote Sensing Images

VrsNet - density map prediction network for individual tree detection and counting from UAV images

Benchmarking Anchor-Based and Anchor-Free State-of-the-Art Deep Learning Methods for Individual Tree Detection in RGB High-Resolution Images

Cross Domain Adaptation of Crowd Counting with Model-Agnostic Meta-Learning

Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection

Tree Detection and Diameter Estimation Based on Deep Learning

Adaptive Mean Shift-Based Identification of Individual Trees Using Airborne LiDAR Data

Deep Learning Enables Image-Based Tree Counting, Crown Segmentation, and Height Prediction at National Scale

Individual tree detection and counting based on high-resolution imagery and the canopy height model data

Camphor tree detection in urban environments using RGB-DSM data fusion

Crowd counting via unsupervised cross-domain feature adaptation
