Multi-Scale Apple Tree Crown Instance Segmentation from UAV Orthomosaics via Enhanced Mask R-CNN and DBSCAN-based Clustering
Description
Accurate orchard-scale delineation of individual apple tree crowns from UAV orthomosaics is difficult in late fall due to color shifts, partial defoliation, shadows, and frequent crown overlap. We propose a two-stage pipeline that couples an enhanced Mask R-CNN with DBSCAN-based post-processing to generate a topology-consistent crown map for an entire orchard. The baseline Mask R-CNN is strengthened by a transformer-augmented ResNet-50 backbone and a U-Net decoder, allowing simultaneous modeling of long-range context and recovery of fine boundary details for small and densely packed crowns. For large orthomosaics, overlapping tile inference can create duplicate, fragmented, or truncated instances; therefore, we cluster predicted masks using centroid-based DBSCAN and apply multi-round IoU- and confidence-based filtering to select one representative mask per tree and to merge tile outputs. Using UAV RGB imagery collected on 7 November 2024 in Quebec, Canada, the proposed model achieved precision 0.897, recall 0.880, and F1-score 0.889, outperforming standard Mask R-CNN and YOLOv8 on the test set. The post-processing step reliably removed tiling artifacts and produced georeferenced per-tree polygons and centroids for tree counting, crown area estimation, and spatial decision support. The proposed workflow provides a practical and reproducible approach for orchard inventory and precision management from UAV orthomosaics.
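The duplicate-removal idea in the description (group masks by centroid, then keep one representative per tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary fields `id`, `centroid`, and `score`, the function names, and the `eps` value are all hypothetical. To stay dependency-free, the sketch uses the special case of DBSCAN with `min_samples=1`, where every point is a core point and clusters reduce to connected components of the `eps`-neighbour graph. The multi-round IoU filtering mentioned above is not covered here.

```python
from math import hypot


def cluster_centroids(centroids, eps):
    """Label points so that chains of neighbours within eps share a label.

    Equivalent to DBSCAN with min_samples=1 (an assumption of this sketch).
    """
    parent = list(range(len(centroids)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(centroids):
        for j in range(i + 1, len(centroids)):
            xj, yj = centroids[j]
            if hypot(xi - xj, yi - yj) <= eps:
                parent[find(i)] = find(j)  # merge the two groups
    return [find(i) for i in range(len(centroids))]


def select_representatives(detections, eps):
    """Keep the highest-confidence detection from each centroid cluster."""
    labels = cluster_centroids([d["centroid"] for d in detections], eps)
    best = {}
    for label, det in zip(labels, detections):
        if label not in best or det["score"] > best[label]["score"]:
            best[label] = det
    return sorted(best.values(), key=lambda d: d["centroid"])


# Hypothetical detections from two overlapping tiles (map units, e.g. metres).
detections = [
    {"id": "tile0_a", "centroid": (10.0, 10.0), "score": 0.95},
    {"id": "tile1_a", "centroid": (10.8, 9.6), "score": 0.80},  # same tree, seen twice
    {"id": "tile1_b", "centroid": (25.0, 12.0), "score": 0.70},
]
kept = select_representatives(detections, eps=1.5)
print([d["id"] for d in kept])
```

In practice `eps` would need to be chosen relative to tree spacing and ground sampling distance, and a library implementation such as scikit-learn's `DBSCAN` would allow `min_samples` greater than 1 so that isolated low-quality detections can be treated as noise.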
Institutions
- McGill University, Montreal, Quebec