Multi-Agent Reinforcement Learning Method for Dual-UAV Cooperative Trajectory Optimization in Railway Inspection

HUANG Gaoyong, SONG Jun, FANG Xuming, YAN Li, HE Rong

Citation: HUANG Gaoyong, SONG Jun, FANG Xuming, YAN Li, HE Rong. Multi-Agent Reinforcement Learning Method for Dual-UAV Cooperative Trajectory Optimization in Railway Inspection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251321


doi: 10.11999/JEIT251321 cstr: 32379.14.JEIT251321
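Funds: National Natural Science Foundation of China (62071393)
    About the authors:

    HUANG Gaoyong: male, lecturer; research interest: wireless communication network optimization for rail transit

    SONG Jun: male, master's student; research interest: UAV-assisted communications

    FANG Xuming: male, professor; research interests: wireless and mobile communication networks, transportation communication and information systems, and related topics

    YAN Li: female, associate professor; research interest: railway 5G-R mobile communications

    HE Rong: female, associate professor; research interest: integrated sensing, communication, and computation networks

    Corresponding author:

    HUANG Gaoyong gyhuang@swjtu.edu.cn

  • CLC number: TN929.5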

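  • Abstract: Traditional manual and rail-vehicle inspection of railways suffers from low efficiency, high labor intensity, and safety hazards, and cannot meet the needs of future intelligent railway operation and maintenance. Existing single-UAV inspection schemes, constrained by the railway protection zone, are limited by coverage blind spots and poor data synchronization. To address this, this paper takes the maximization of task quality as its objective and proposes a dual-UAV cooperative inspection trajectory optimization method based on deep reinforcement learning. To handle the coupling among multiple constraints (energy consumption, obstacle avoidance, communication, and formation keeping), the paper builds a dual-UAV cooperative inspection optimization model and designs a two-stage hierarchical solution framework. In the first stage, a particle swarm optimization (PSO) algorithm determines the optimal cooperative observation positions for each inspection task. In the second stage, a trajectory optimization model based on multi-agent deep reinforcement learning is built, risk-adaptive exploration noise is introduced to improve training convergence stability in strongly constrained environments, and an improved multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm is proposed to solve the model. Simulation results show that, compared with several baseline algorithms in the dual-UAV cooperative inspection scenario, the proposed scheme shortens the UAV flight path length by about 4.5%, reduces cumulative energy consumption by 8.9%, and shortens the arrival time difference between the two UAVs by 30.3%, effectively improving the completion quality of railway inspection tasks in complex environments.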
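    The first stage of the framework relies on PSO, but this page does not give the objective or the PSO settings. The following is a minimal sketch of a standard global-best PSO under stated assumptions. The toy cost function, the task location, the standoff distance, the protection-zone half-width, the search bounds, and all PSO hyperparameters are hypothetical placeholders, not values from the paper.

```python
import numpy as np


def pso_minimize(cost, lower, upper, n_particles=30, n_iters=200,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Standard global-best PSO for box-constrained minimization."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pos = rng.uniform(lower, upper, size=(n_particles, lower.size))
    vel = np.zeros_like(pos)
    pbest_pos = pos.copy()                       # personal best positions
    pbest_cost = np.array([cost(p) for p in pos])
    best = int(np.argmin(pbest_cost))
    gbest_pos, gbest_cost = pbest_pos[best].copy(), float(pbest_cost[best])

    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull to personal best + pull to global best.
        vel = (inertia * vel
               + c_cognitive * r1 * (pbest_pos - pos)
               + c_social * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)   # keep particles inside the box
        costs = np.array([cost(p) for p in pos])
        improved = costs < pbest_cost
        pbest_pos[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        best = int(np.argmin(pbest_cost))
        if pbest_cost[best] < gbest_cost:
            gbest_pos, gbest_cost = pbest_pos[best].copy(), float(pbest_cost[best])
    return gbest_pos, gbest_cost


# Hypothetical toy problem (NOT the paper's objective): place two UAVs at a
# preferred standoff distance from one task point while keeping each UAV
# laterally outside a protection zone centred on the track (y = 0).
TASK_POINT = np.array([100.0, 0.0, 0.0])   # assumed task location, m
STANDOFF_M = 30.0                          # assumed preferred distance, m
ZONE_HALF_WIDTH_M = 10.0                   # assumed protection-zone half-width, m


def toy_observation_cost(x):
    """x = [xa, ya, za, xb, yb, zb]; lower is better."""
    cost = 0.0
    for uav in (x[:3], x[3:]):
        cost += (np.linalg.norm(uav - TASK_POINT) - STANDOFF_M) ** 2
        cost += 1e3 * max(0.0, ZONE_HALF_WIDTH_M - abs(uav[1]))  # zone penalty
    return float(cost)


LOWER = [50.0, -60.0, 10.0] * 2            # assumed search box per UAV
UPPER = [150.0, 60.0, 60.0] * 2
positions, value = pso_minimize(toy_observation_cost, LOWER, UPPER)
print("UAV A:", positions[:3], "UAV B:", positions[3:], "cost:", value)
```

    In this sketch the constraint is handled with a penalty term, which is one common choice; the paper's own constraint handling may differ.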
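  • Figure 1  Schematic of the railway inspection scenario based on dual-UAV cooperation

    Figure 2  Block diagram of the MATD3-based dual-UAV cooperative inspection trajectory optimization algorithm

    Figure 3  Convergence comparison in the ablation experiment on the risk-adaptive noise mechanism

    Figure 4  Convergence comparison of the algorithms for different numbers of inspection tasks

    Figure 5  3D plots of the optimized dual-UAV inspection trajectories of each algorithm for different numbers of inspection tasks

    Figure 6  Comparison of cumulative UAV energy consumption across algorithms

    Figure 7  Comparison of UAV navigation error across algorithms

    Figure 8  Comparison of inspection task completion quality across algorithms

    Figure 9  Comparison across algorithms of the time-step difference between the two UAVs arriving at the same inspection point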

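    Table 1  Simulation parameters

    Parameter                                                Value
    Air density $ \rho $                                     1.225 kg/m³
    Rotor disc area $ A $                                    0.5 m²
    Rotor blade tip speed $ U_{\text{tip}} $                 120 m/s
    Power produced by the UAV blades $ P_{0} $               59.03 W
    Hovering power $ P_{1} $                                 79.07 W
    Mean rotor velocity                                      3.6 m/s
    Maximum UAV speed $ v_{\max} $                           10 m/s
    Maximum UAV acceleration $ a_{\max} $                    2 m/s²
    LoS link attenuation factor $ \xi_{\text{LoS}} $         3 dB
    NLoS link attenuation factor $ \xi_{\text{NLoS}} $       23 dB
    Total bandwidth $ B $                                    2 MHz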
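
    Table 1 lists constants that match the form of a rotary-wing propulsion power model, but the formula itself is not reproduced on this page. The sketch below assumes the widely used model of Zeng, Xu, and Zhang (2019), $ P(V) = P_0 (1 + 3V^2/U_{\text{tip}}^2) + P_1 (\sqrt{1 + V^4/(4 v_0^4)} - V^2/(2 v_0^2))^{1/2} + \tfrac{1}{2} d_0 \rho s A V^3 $. Reading the "mean rotor velocity" as the mean rotor induced velocity in hover $ v_0 $ and $ P_1 $ as the induced power in hover is an assumption, and the fuselage drag ratio $ d_0 = 0.6 $ and rotor solidity $ s = 0.05 $ are not in Table 1 and are assumed here.

```python
import math

# Values taken from Table 1.
RHO = 1.225     # air density, kg/m^3
AREA = 0.5      # rotor disc area, m^2
U_TIP = 120.0   # rotor blade tip speed, m/s
P0 = 59.03      # power produced by the UAV blades, W
P1 = 79.07      # hovering power, W (assumed to act as induced power in hover)
V0 = 3.6        # mean rotor velocity, m/s (assumed to be induced velocity in hover)

# NOT in Table 1: assumed illustrative values.
D0 = 0.6        # fuselage drag ratio (assumption)
SOLIDITY = 0.05 # rotor solidity (assumption)


def propulsion_power(speed):
    """Propulsion power in W at forward speed `speed` in m/s (assumed model)."""
    blade = P0 * (1.0 + 3.0 * speed ** 2 / U_TIP ** 2)
    ratio = speed ** 2 / (2.0 * V0 ** 2)
    induced = P1 * math.sqrt(math.sqrt(1.0 + ratio ** 2) - ratio)
    parasite = 0.5 * D0 * RHO * SOLIDITY * AREA * speed ** 3
    return blade + induced + parasite


if __name__ == "__main__":
    for v in (0.0, 5.0, 10.0):   # 10 m/s is the maximum speed in Table 1
        print(f"V = {v:4.1f} m/s -> P = {propulsion_power(v):7.2f} W")
```

    At zero speed the expression reduces to $ P_0 + P_1 = 138.10 $ W. Cumulative energy along a trajectory could then be approximated by summing `propulsion_power(v) * dt` over time steps, again under the assumptions above.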

    Table 2  UAV path length planned by each algorithm for different numbers of tasks (unit: m)

    Algorithm    1 task    2 tasks    3 tasks
    DDPG         9402      15393      16342
    SAC          9413      14077      17838
    TD3          9581      14019      17771
    MASAC        9524      13643      17499
    MATD3        9355      13025      17395
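
    The relative path-length differences implied by Table 2 can be checked with a few lines of arithmetic. The sketch below computes, for each baseline and task count, the percentage by which the MATD3 path is shorter. The helper names are hypothetical, and this page does not say which baseline or task count the abstract's "about 4.5%" figure refers to, so no such mapping is asserted here.

```python
# Path lengths in metres from Table 2, indexed by number of tasks (1, 2, 3).
PATH_LENGTH_M = {
    "DDPG":  [9402, 15393, 16342],
    "SAC":   [9413, 14077, 17838],
    "TD3":   [9581, 14019, 17771],
    "MASAC": [9524, 13643, 17499],
    "MATD3": [9355, 13025, 17395],
}


def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def matd3_reductions():
    """Per-baseline list of MATD3 path-length reductions for 1, 2, 3 tasks."""
    proposed = PATH_LENGTH_M["MATD3"]
    return {
        name: [reduction_percent(b, p) for b, p in zip(lengths, proposed)]
        for name, lengths in PATH_LENGTH_M.items()
        if name != "MATD3"
    }


if __name__ == "__main__":
    for name, values in matd3_reductions().items():
        print(name, [f"{v:+.2f}%" for v in values])
```

    A positive value means the MATD3 path is shorter than that baseline's path for that task count; a negative value means it is longer.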
Publication history
  • Revised:  2026-03-24
  • Accepted:  2026-03-24
  • Published online:  2026-04-19
