Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy

CHEN Jiamei; SUN Huiwen; LI Yufeng; WANG Yupeng; BIE Yuxia

doi:10.11999/JEIT250021

Volume 47 Issue 8

Aug. 2025

CHEN Jiamei, SUN Huiwen, LI Yufeng, WANG Yupeng, BIE Yuxia. Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy[J]. Journal of Electronics & Information Technology, 2025, 47(8): 2621-2629. doi: 10.11999/JEIT250021

doi: 10.11999/JEIT250021 cstr: 32379.14.JEIT250021

College of Electronic Information Engineering, Shenyang Aerospace University, Shenyang 110136, China

Funds: The National Natural Science Foundation of China (61501306), The Project of Liaoning Province Education Department Foundation (LJKMZ20220519, LJKMZ20220526), Shenyang Natural Science Foundation Special Project (23-503-6-18), The School Scientific Research Foundation (2019-1-ZZLX-07)

Received Date: 2025-01-10
Revision Received Date: 2025-04-17

Available Online: 2025-05-10

Publish Date: 2025-08-27

Abstract

Objective: To address the trend toward denser and more spatially distributed networks, this study proposes an air–ground integrated ultra-dense network architecture with multiple base station types and develops a semi-distributed scheme for resource optimization. The network comprises coexisting macro, micro, and Unmanned Aerial Vehicle (UAV) base stations. A semi-distributed Double Deep Q Network (DDQN)-based power control scheme is designed to reduce the computational burden and improve the response speed relative to fully centralized approaches, and to overcome the lack of global optimization in fully distributed ones. The scheme improves energy efficiency by combining distributed decision-making at the base station level with centralized training by a network trainer, balancing computational complexity against performance. The DDQN algorithm supports local decision-making, while centralized coordination ensures network-wide optimization.

Methods: This study establishes a dense air–ground integrated network model with coexisting macro, micro, and UAV base stations, and proposes a semi-distributed DDQN scheme to improve network energy efficiency. The methods are as follows: (1) Construct an integrated air–ground dense network model in which macro, micro, and UAV base stations share the spectrum through a cooperative mechanism, thereby overcoming the performance bottlenecks of conventional heterogeneous networks. (2) Develop an improved semi-distributed DDQN algorithm that improves Q-value estimation accuracy, addressing the limitations of traditional centralized and distributed control modes and mitigating the Q-value overestimation observed in conventional Deep Q Network (DQN) approaches. (3) Introduce a disturbance factor that increases the probability of exploring random actions, which strengthens the algorithm's ability to escape local optima and improves estimation accuracy.

Results and Discussions: Simulation results show that the proposed semi-distributed DDQN scheme adapts to dense and complex network topologies and improves both energy efficiency and total throughput relative to the traditional DQN and Q-learning algorithms. The total throughput achieved by DDQN exceeds that of both baselines (Fig. 3). The energy efficiency of DDQN converges to 84.60%, which is 15.18 percentage points higher than DQN (69.42%) and 17.10 percentage points higher than Q-learning (67.50%) (Fig. 4). The loss of DDQN also decreases more rapidly and stabilizes at a lower level: as the number of iterations increases, the loss curve becomes smoother and ultimately converges to 100, which is 100 lower than that of DQN (Fig. 5). DDQN also achieves the highest user access success rate of the three algorithms (Fig. 6) and reaches an access success rate of 80% in significantly fewer iterations than DQN and Q-learning. This advantage becomes more pronounced under high user density. For example, with 800 users, DDQN requires fewer iterations than both DQN and Q-learning to achieve comparable performance (Fig. 7).

Conclusions: This study proposes a semi-distributed DDQN strategy for intelligent control of base station transmission power in ultra-dense air–ground networks. Unlike traditional methods that target energy efficiency at the level of individual base stations, the proposed strategy optimizes the overall energy efficiency of the network. By dynamically adjusting the transmission power of macro, micro, and UAV base stations through learning, the scheme achieves system-level coordination and adaptation. Simulation results confirm the adaptability and performance of the proposed DDQN scheme under complex and dynamic network conditions. Compared with the conventional DQN and Q-learning approaches, DDQN is more flexible and effective in resource control, achieving higher energy efficiency and sustained improvements in total throughput. These findings offer a new approach to the design and management of integrated air–ground networks and provide a technical basis for future large-scale dense network architectures.

Keywords:
- Air–ground dense network
- Semi-distributed
- Double Deep Q Network (DDQN) algorithm
- Resource optimization
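
The abstract states that DDQN mitigates the Q-value overestimation of conventional DQN. The sketch below is a minimal illustration of the standard Double DQN target, not the authors' implementation. The function names, the discount factor `gamma=0.9`, and the three hypothetical transmit-power actions with their Q-values are assumptions made only for this example. In DQN, the target network both selects and evaluates the next action. In Double DQN, the online network selects the action and the target network evaluates it.

```python
import numpy as np

def dqn_target(reward, next_q_target, gamma=0.9, done=False):
    """Standard DQN target: the target network selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * np.max(next_q_target)

def ddqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = int(np.argmax(next_q_online))  # action chosen by the online network
    return reward + gamma * next_q_target[best_action]

# Hypothetical Q-values for three transmit-power levels in the next state (assumed data).
next_q_online = np.array([1.0, 2.0, 1.5])
next_q_target = np.array([1.2, 1.8, 2.5])

y_dqn = dqn_target(1.0, next_q_target)
y_ddqn = ddqn_target(1.0, next_q_online, next_q_target)
print(f"DQN target: {y_dqn:.2f}, Double DQN target: {y_ddqn:.2f}")
```

Because the Double DQN target evaluates one specific action rather than taking the maximum over the target network's estimates, it can never exceed the DQN target for the same inputs, which is the mechanism behind the reduced overestimation.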
