Small Object Detection Algorithm for UAV Aerial Images in Complex Environments

LIU Jie; LIU Shuhao; TIAN Ming; CUI Zhigang

doi:10.11999/JEIT251126

Volume 48 Issue 4

Apr. 2026

LIU Jie, LIU Shuhao, TIAN Ming, CUI Zhigang. Small Object Detection Algorithm for UAV Aerial Images in Complex Environments[J]. Journal of Electronics & Information Technology, 2026, 48(4): 1763-1773. doi: 10.11999/JEIT251126

doi: 10.11999/JEIT251126 cstr: 32379.14.JEIT251126

1. School of Measurement and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China
2. China Telecom Heilongjiang Branch, Harbin 150040, China
3. Heilongjiang Provincial Highway Construction Center, Harbin 150001, China

Funds: The Natural Science Foundation of Heilongjiang Province (LH2023E086), The Science and Technology Project of Heilongjiang Provincial Communications Department (HJK2024B002)

Received Date: 2025-10-27
Accepted Date: 2026-01-22
Rev Recd Date: 2026-01-22

Available Online: 2026-02-11

Publish Date: 2026-04-10

Abstract

Objective Small object detection is critical in applications such as Unmanned Aerial Vehicle (UAV) inspection and intelligent transportation systems, where accurate perception of small targets is essential for operational reliability and safety. It supports automated identification and tracking of hard-to-detect targets. However, small objects occupy few pixels, are frequently occluded, and often blend into the background. The resulting background noise leads to poor performance and high false-negative rates in existing detection models. To achieve high-precision detection of small objects in complex scenes, this study proposes HAR-DETR, an enhanced version of the RT-DETR baseline model.

Methods HAR-DETR is designed for small object detection in aerial images and integrates three improvements: Aggregated Attention, RFF-FPN (Recalibrated Feature Fusion Network-FPN), and a High-Resolution Detection Branch (HRDB). In the backbone, Aggregated Attention strengthens the model's focus on features relevant to small objects. By expanding the receptive field, it captures detailed edge and texture information and improves multi-scale feature extraction. During feature fusion, RFF-FPN selectively integrates high- and low-level features to retain critical spatial information and context. This supports better reconstruction of the edges and contours of small objects and improves localization and recognition, particularly when object details are partially obscured by cluttered backgrounds or variable lighting. The HRDB emphasizes edge features of small objects, enhancing perception and improving robustness and precision.
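
The selective integration of high- and low-level features described in the Methods can be illustrated with a minimal gated-fusion sketch. It is not the RFF-FPN module itself, whose internal design is not given in this abstract. It assumes single-channel feature maps stored as nested lists, a coarse map at exactly half the resolution of the fine map, and scalar gate parameters (`w_low`, `w_high`, `bias`) standing in for learned weights; all function and parameter names are hypothetical.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(low, high, w_low=1.0, w_high=1.0, bias=0.0):
    """Blend a fine (low-level) map with an upsampled coarse (high-level) map.

    Per pixel:
        gate  = sigmoid(w_low * low + w_high * high_up + bias)
        fused = gate * low + (1 - gate) * high_up
    The scalar weights are illustrative stand-ins for learned parameters.
    """
    high_up = upsample2x(high)
    if len(high_up) != len(low) or len(high_up[0]) != len(low[0]):
        raise ValueError("low must be exactly twice the size of high")
    fused = []
    for low_row, high_row in zip(low, high_up):
        row = []
        for l, h in zip(low_row, high_row):
            g = sigmoid(w_low * l + w_high * h + bias)
            row.append(g * l + (1.0 - g) * h)
        fused.append(row)
    return fused


# Fine map with a thin vertical edge; coarse map with one active cell.
low = [[0.0, 2.0, 0.0, 0.0],
       [0.0, 2.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0]]
high = [[1.0, 0.0],
        [0.0, 0.0]]

fused = gated_fuse(low, high)
for row in fused:
    print([round(v, 3) for v in row])
```

Because the gate lies between 0 and 1, each fused value is a convex combination of the fine and the upsampled coarse value: where the fine map responds strongly (the edge column), the gate moves toward 1 and the fine detail dominates, while elsewhere the coarse context is carried over at reduced weight.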

Results and Discussions HAR-DETR is compared with commonly used object detection models, including YOLOv5, YOLOv8, and YOLOv10, using precision, recall, and mAP to assess small object detection performance. Experimental results show that HAR-DETR outperforms the comparison models on the VisDrone2019 dataset (Table 1). mAP₅₀ and mAP₅₀₋₉₅ increase by 3.8% and 3.2%, respectively, relative to the baseline model (Table 2). These results demonstrate superior detection performance in aerial images under complex conditions. Grad-CAM heatmaps are used for comparative analysis and show consistent improvements from each proposed component relative to the baseline model (Fig. 6). In the generalization experiment, the model is evaluated on the VisDrone2019 validation set and the RSOD dataset under identical training settings. The results confirm that HAR-DETR maintains strong generalization across heterogeneous tasks (Tables 3 and 4).

Conclusions This work proposes HAR-DETR to reduce false positives and false negatives in small object detection for aerial images captured in complex environments. Aggregated Attention is used in the backbone to expand the receptive field and improve global feature extraction. During feature fusion, the RFF-FPN structure strengthens feature representation. The high-resolution detection branch further increases sensitivity to the edge textures of small objects. Evaluation on the VisDrone2019 and RSOD datasets shows that: (1) mAP₅₀ and mAP₅₀₋₉₅ improve by 3.8% and 3.2%, respectively, reaching 51.2% and 32.1%, which reduces false negatives and false positives; (2) HAR-DETR outperforms mainstream object detection models, confirming its effectiveness; (3) the model achieves high accuracy in cross-dataset training, demonstrating strong generalization. These results show that HAR-DETR has stronger semantic representation and spatial awareness, adapts well to varied aerial perspectives and target distributions, and provides a more versatile solution for UAV visual perception in complex environments.

Key words: Small object detection, RT-DETR, Feature fusion, Aerial images
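
The precision and recall metrics named in the abstract depend on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below shows that matching for a single image and a single class at an IoU threshold of 0.5, the threshold behind mAP₅₀. It is an illustrative simplification, not the evaluation code used in the paper: boxes are assumed to be `(x1, y1, x2, y2)` tuples, matching is greedy in descending score order, and the function names and sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, gts, iou_thr=0.5):
    """Greedy matching for one image and one class.

    preds: list of (score, box); gts: list of boxes.
    Returns (precision, recall, false_negatives).
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, gt)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(gts) - tp     # ground-truth objects that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if gts else 0.0
    return precision, recall, fn


# Hypothetical example: two objects found, one small object missed,
# and one spurious detection on the background.
gts = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 56, 56)]
preds = [(0.9, (1, 1, 10, 10)), (0.8, (21, 21, 31, 31)), (0.3, (80, 80, 90, 90))]

p, r, fn = precision_recall(preds, gts)
print(f"precision={p:.3f} recall={r:.3f} false_negatives={fn}")

# The same 2-pixel localization error costs a small box far more IoU.
print(round(iou((0, 0, 6, 6), (2, 2, 8, 8)), 3))
print(round(iou((0, 0, 60, 60), (2, 2, 62, 62)), 3))
```

The last two lines illustrate a general property of IoU rather than a result from the paper: a fixed pixel offset removes a much larger share of the overlap for a 6×6 box than for a 60×60 box, which is one reason small objects are prone to being counted as false negatives at a fixed threshold.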
