Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy

CHEN Jiamei, SUN Huiwen, LI Yufeng, WANG Yupeng, BIE Yuxia

Citation: CHEN Jiamei, SUN Huiwen, LI Yufeng, WANG Yupeng, BIE Yuxia. Double Deep Q Network Algorithm-based Unmanned Aerial Vehicle-assisted Dense Network Resource Optimization Strategy[J]. Journal of Electronics & Information Technology, 2025, 47(8): 2621-2629. doi: 10.11999/JEIT250021


doi: 10.11999/JEIT250021 cstr: 32379.14.JEIT250021
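Funds: The National Natural Science Foundation of China (61501306), The Project of Liaoning Province Education Department Foundation (LJKMZ20220519, LJKMZ20220526), Shenyang Natural Science Foundation Special Project (23-503-6-18), The School Scientific Research Foundation (2019-1-ZZLX-07)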
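    About the authors:

    CHEN Jiamei: Female, Associate Professor. Research interest: resource management in air-ground wireless networks.

    SUN Huiwen: Female, Master's student. Research interest: resource management in air-ground wireless networks.

    LI Yufeng: Male, Professor. Research interest: image processing.

    WANG Yupeng: Male, Professor. Research interests: self-organizing networks and the Internet of Vehicles.

    BIE Yuxia: Female, Associate Professor. Research interest: satellite networks.

    Corresponding author:

    LI Yufeng, li_yufeng@126.com

  • CLC number: TN92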

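  • Abstract: To keep pace with the trend of future networks toward densification and extension into the air, this paper proposes an air-ground integrated ultra-dense complex network in which multiple types of base stations coexist, and develops a semi-distributed scheme to optimize its resources. First, an ultra-dense complex network architecture is established that comprises macro base stations, micro base stations, and Unmanned Aerial Vehicle (UAV) aerial base stations. On this basis, a semi-distributed Double Deep Q Network (DDQN) power control scheme is proposed to address two problems: the heavy computational burden and slow response of conventional fully centralized schemes, and the lack of a global optimization view in distributed schemes. The scheme aims to optimize network Energy Efficiency (EE) and balances computational complexity against performance by combining distributed decision-making with centralized training. Specifically, the DDQN algorithm makes decisions in a distributed manner at the base stations, while a centralized network trainer is introduced to ensure that the energy efficiency of the network as a whole is optimized. Simulation results show that the proposed semi-distributed DDQN scheme adapts well to the dense and complex network structure and achieves significant improvements in both energy efficiency and total throughput over the conventional Deep Q Network (DQN).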
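    The abstract contrasts DDQN with conventional DQN but does not spell out how the two differ. As a minimal sketch, assuming the standard Double DQN formulation (not confirmed by this page as the authors' exact update rule), the difference lies in how the learning target is formed: DQN lets the target network both select and evaluate the next action, whereas Double DQN selects with the online network and evaluates with the target network. The function names, the discount factor `gamma`, and the sample Q-values below are all hypothetical.

```python
import numpy as np

def dqn_target(reward, q_next_target, gamma=0.9):
    """DQN target: the target network both selects and evaluates the next action."""
    return reward + gamma * np.max(q_next_target)

def double_dqn_target(reward, q_next_online, q_next_target, gamma=0.9):
    """Double DQN target: the online network selects, the target network evaluates."""
    best_action = int(np.argmax(q_next_online))
    return reward + gamma * q_next_target[best_action]

# Hypothetical Q-values of the next state under each network (illustration only).
q_online = np.array([1.0, 2.0, 0.5])
q_target = np.array([3.0, 1.5, 0.2])

y_dqn = dqn_target(1.0, q_target)                    # uses max of q_target
y_ddqn = double_dqn_target(1.0, q_online, q_target)  # uses q_target at argmax of q_online
```

    In this toy case the Double DQN target is the smaller of the two, which illustrates why decoupling selection from evaluation tends to reduce overestimation of action values.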
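  • Figure 1  Schematic of the ultra-dense UAV-assisted network

    Figure 2  Structure of the DDQN solution

    Figure 3  Total system throughput achieved by the DDQN, DQN, and Q-learning algorithms versus the number of iterations

    Figure 4  Energy efficiency (EE) achieved by the DDQN, DQN, and Q-learning algorithms versus the number of iterations

    Figure 5  Loss functions of the DDQN and DQN algorithms versus the number of iterations

    Figure 6  Access success rate achieved by the DDQN, DQN, and Q-learning algorithms versus the number of iterations

    Figure 7  Number of iterations required by the DDQN, DQN, and Q-learning algorithms to reach an access success rate of 80% for different numbers of users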

    Table 1  DDQN power allocation algorithm

     (1) Input: initial value of the action-value function $ Q(\boldsymbol{s}_t, \boldsymbol{a}_t, \omega^{\mathrm{predict}}) $; random initial state and action, $ \boldsymbol{s}_t = \boldsymbol{s}_0 $, $ \boldsymbol{a}_t = \boldsymbol{a}_0 $; random initial weights $ \omega^{\mathrm{predict}} = \omega_0^{\mathrm{predict}} $
     (2) Output: power control policy $ P_n $
     (3) Initialize the experience replay buffer with capacity $ D $
     (4) Initialize the target DNN weights to be equal to the estimation DNN weights
     (5) for $ k = 1, \ldots, K $ do
     (6)   Choose a random value between 0 and 1 as $ \varepsilon $
     (7)   With probability $ \varepsilon $, select a random action $ \boldsymbol{a}_t $
     (8)   With probability $ 1 - \varepsilon $, select $ \boldsymbol{a}_{t + 1} = \arg\max_{\boldsymbol{a}} Q(\boldsymbol{s}_t, \boldsymbol{a}, \omega^{\mathrm{target}}) $
     (9)   Decrease $ \varepsilon $ according to Eq. (22)
     (10)  Obtain the next state $ \boldsymbol{s}_{t + 1} $ and the reward $ r_t $
     (11)  Store $ \left\langle \boldsymbol{s}_t, \boldsymbol{a}_t, r_t, \boldsymbol{s}_{t + 1} \right\rangle $ in the experience replay buffer
     (12)  if $ k > D $ then
     (13)    Sample $ J $ tuples $ \left\langle \boldsymbol{s}_t, \boldsymbol{a}_t, r_t, \boldsymbol{s}_{t + 1} \right\rangle $ from the experience replay buffer
     (14)    Update $ \omega^{\mathrm{predict}} $ by minimizing the loss function according to Eq. (21)
     (15)    if $ k = g $ then
     (16)      Copy the parameters of the estimation DNN to the target DNN
     (17)    end if
     (18)  end if
     (19) end for
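    The steps of the table can be traced in a short runnable sketch. Several simplifications are assumptions rather than the paper's setup: a small weight matrix stands in for the DNN, the environment (`step`, `GAINS`, `POWERS`, `P_CIRCUIT`) is a hypothetical toy with an energy-efficiency-like reward, the multiplicative epsilon decay replaces Eq. (22), a squared-error update with the standard Double DQN target replaces Eq. (21), and the condition "k = g" is read as a periodic synchronization every `G` iterations. None of these details are given on this page.

```python
import random
from collections import deque

import numpy as np

rng = np.random.default_rng(0)
random.seed(0)

# Hypothetical toy problem: 4 channel states, 3 transmit power levels (assumed values).
N_STATES = 4
POWERS = np.array([0.1, 0.5, 1.0])
GAINS = np.array([0.5, 1.0, 2.0, 4.0])
P_CIRCUIT, GAMMA, LR = 0.2, 0.9, 0.1   # assumed constants
K, D, J, G = 3000, 200, 16, 50         # iterations, buffer capacity, batch size, sync period


def step(state, action):
    """Toy environment: reward is rate divided by consumed power; next state is random."""
    rate = np.log2(1.0 + GAINS[state] * POWERS[action])
    reward = rate / (POWERS[action] + P_CIRCUIT)
    return int(rng.integers(N_STATES)), float(reward)


# Step (1): random initial weights; a matrix replaces the estimation DNN (assumption).
w_predict = rng.normal(scale=0.01, size=(N_STATES, len(POWERS)))
w_target = w_predict.copy()            # step (4): target weights equal estimation weights
buffer = deque(maxlen=D)               # step (3): replay buffer of capacity D
eps = 1.0
state = int(rng.integers(N_STATES))

for k in range(1, K + 1):              # step (5)
    if rng.random() < eps:             # step (7): explore with probability eps
        action = int(rng.integers(len(POWERS)))
    else:                              # step (8): greedy action, target weights as in the table
        action = int(np.argmax(w_target[state]))
    eps = max(0.05, eps * 0.999)       # step (9): assumed decay, stands in for Eq. (22)
    next_state, reward = step(state, action)             # step (10)
    buffer.append((state, action, reward, next_state))   # step (11)
    if k > D:                          # step (12)
        for s, a, r, s2 in random.sample(list(buffer), J):  # step (13)
            a_star = int(np.argmax(w_predict[s2]))        # online weights select
            y = r + GAMMA * w_target[s2, a_star]          # target weights evaluate
            w_predict[s, a] += LR * (y - w_predict[s, a])  # step (14): squared-error step
        if k % G == 0:                 # step (15): read here as "every G iterations"
            w_target = w_predict.copy()                   # step (16)
    state = next_state

# Step (2): greedy power level per state under the learned weights.
policy = POWERS[np.argmax(w_predict, axis=1)]
```

    The resulting `policy` array holds one power level per toy channel state. Because the environment is invented for illustration, its values say nothing about the performance reported in the paper.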
Publication history
  • Received:  2025-01-10
  • Revised:  2025-04-17
  • Available online:  2025-05-10
  • Issue published:  2025-08-27
