Abstract: Small-object detection in UAV aerial imagery is highly susceptible to degradation caused by irregular shooting angles, viewpoint variations, occlusions, and illumination changes. To address these issues, we propose LDRT-DETR, a lightweight object detection model tailored for UAV scenarios. To mitigate the pronounced scale discrepancy between small objects and their backgrounds, we introduce a multi-scale edge feature extraction module that captures features at different scales and enhances edge information, improving the model’s ability to perceive small objects. A context-enhanced feature fusion strategy integrates contextual cues more effectively and reduces detail loss during feature fusion, improving detection accuracy. To cope with complex backgrounds and severe occlusions in UAV aerial images, we further develop a multi-scale sampling module that combines upsampling and downsampling mechanisms and builds diverse feature extraction pathways through a multi-branch structure, enriching the feature representation. Experiments on the VisDrone2019 dataset show that LDRT-DETR achieves an mAP50 of 50.05%, surpassing RT-DETR by 2.82%, while reducing the number of parameters and GFLOPs by 24.7% and 10%, respectively.