Joint Secure Transmission and Trajectory Optimization for Reconfigurable Intelligent Surface-aided Non-Terrestrial Networks
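Summary: Because direct links between satellites and ground users are limited by coverage and link quality, and because non-terrestrial networks face eavesdropping threats, this paper considers a secure transmission system for a non-terrestrial network with an unmanned aerial vehicle (UAV) relay and introduces a reconfigurable intelligent surface (RIS) to improve the signal quality of legitimate users. To balance the system's demands for a high transmission rate and strong security, the weighted sum of the satellite-to-UAV transmission rate and the secrecy rate of the legitimate ground users is designed as the system utility and taken as the optimization objective. A joint optimization method for satellite and UAV beamforming, the RIS phase shift matrix, and the UAV trajectory is then proposed based on the Twin-Twin Delayed Deep Deterministic Policy Gradient (TTD3) algorithm. A twin-layer deep reinforcement learning structure decouples the beamforming and trajectory optimization subproblems so as to maximize the system utility. Simulation results verify the effectiveness of the proposed method in dynamic non-terrestrial network environments. In addition, under high security requirements, comparisons across different algorithms, configuration schemes, and numbers of RIS elements show that the proposed method improves the secure transmission performance of the system.

Abstract: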
Objective: The proliferation of technologies such as the Internet of Things, smart cities, and next-generation mobile communications has made Non-Terrestrial Networks (NTNs) increasingly important for global communication. Future communication systems are expected to rely heavily on NTNs to provide seamless global coverage and efficient data transmission. However, current NTNs face challenges, including limited coverage and link quality in direct satellite-to-ground user connections, as well as eavesdropping threats. To address these challenges, a system integrating Reconfigurable Intelligent Surfaces (RIS) with a twin-layer Deep Reinforcement Learning (DRL) algorithm is proposed. This approach aims to satisfy the system's requirements for high transmission rates and enhanced security, improving the signal strength for legitimate users while facilitating real-time updates and optimization of channel state information in NTNs.

Methods: First, an RIS-aided downlink NTN system using an Unmanned Aerial Vehicle (UAV) as a relay is established. To balance the system's transmission rate and security requirements, the weighted sum of the satellite-to-UAV transmission rate and the secrecy rate of the legitimate ground users is designed as the system utility, which serves as the optimization objective. A joint optimization method based on the Twin-Twin Delayed Deep Deterministic Policy Gradient (TTD3) algorithm is then proposed. This method jointly optimizes satellite and UAV beamforming, the RIS phase shift matrix, and the UAV trajectory. The algorithm splits the optimization problem into two layers. The first-layer DRL optimizes satellite and UAV beamforming, as well as the RIS phase shift matrix. The second-layer DRL optimizes the UAV trajectory based on the UAV position, user mobility, and channel state information. The two DRL layers share the same reward function, which guides the agent in each layer to adjust its actions and explore optimal strategies, ultimately enhancing the system utility.

Results and Discussions: (1) Compared to the Deep Deterministic Policy Gradient (DDPG), the proposed TTD3 algorithm exhibits smaller dynamic fluctuations, demonstrating greater stability and robustness (Fig. 2). (2) The UAV trajectory and user secrecy rate performance under four different schemes and algorithms show that the proposed method balances service for legitimate users. The UAV trajectory is smoother than that obtained with DDPG, and the overall user secrecy rate is also higher. This confirms that the proposed method can adapt to dynamically changing NTN environments while improving user secrecy rates (Fig. 3, Fig. 4). (3) As the number of RIS reflecting elements increases, the degrees of freedom and precision of beamforming improve. Therefore, the overall user secrecy rates of the different algorithms increase, resulting in enhanced system performance (Fig. 5).

Conclusions: This paper investigates an RIS-assisted downlink secure transmission system for NTNs in the presence of eavesdropping threats. To meet the requirements of high transmission rates and security across different scenarios, the optimization objective is formulated as the weighted sum of the transmission rate from the satellite to the UAV and the secrecy rate of legitimate ground users. A TTD3-based joint optimization method for satellite and UAV beamforming, the RIS phase shift matrix, and the UAV trajectory is proposed. By adopting a twin-layer DRL structure, the beamforming and trajectory optimization subproblems are decoupled to maximize system utility. Simulation results validate the effectiveness of the proposed algorithm. Additionally, comparisons across different algorithms, RIS element counts, and schemes in high-security-demand scenarios demonstrate that the TTD3 algorithm is well suited to dynamically changing NTN environments and can significantly enhance system transmission performance. Future research will explore integrating emerging technologies, such as federated learning and meta-learning, to achieve distributed, low-latency policy optimization, thereby facilitating network resource optimization and interference analysis in large-scale, multi-satellite, and multi-UAV complex scenarios.
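The system utility described in the Methods can be illustrated with a minimal sketch. The abstract only states that the utility is a weighted sum of the satellite-to-UAV rate and the secrecy rate of the legitimate users, so everything else here is an assumption: the weights `weight` and `1 - weight`, the Shannon-rate form, the secrecy rate taken as the non-negative gap between the user rate and the eavesdropper rate, the summation over users, and all function and parameter names.

```python
import math


def rate(sinr: float) -> float:
    """Achievable rate in bit/s/Hz for a linear (not dB) SINR."""
    return math.log2(1.0 + sinr)


def secrecy_rate(sinr_user: float, sinr_eve: float) -> float:
    """Assumed secrecy rate: user rate minus eavesdropper rate, floored at 0."""
    return max(rate(sinr_user) - rate(sinr_eve), 0.0)


def system_utility(sinr_sat_uav, sinr_users, sinr_eves, weight):
    """Weighted sum of the satellite-to-UAV rate and the total secrecy rate.

    `weight` in [0, 1] is a hypothetical trade-off parameter: values near 0
    favor security, values near 1 favor the satellite-to-UAV rate.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    total_secrecy = sum(
        secrecy_rate(u, e) for u, e in zip(sinr_users, sinr_eves)
    )
    return weight * rate(sinr_sat_uav) + (1.0 - weight) * total_secrecy


if __name__ == "__main__":
    # One legitimate user; a small weight models a high-security scenario.
    print(system_utility(3.0, [7.0], [1.0], weight=0.2))
```

In a twin-layer setup such as the one described, a single scalar of this kind could serve as the shared reward for both agents, which is one way to read the statement that the two DRL layers share the same reward function.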
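Algorithm 1: TTD3-based secure transmission and trajectory optimization procedure for NTNs

Initialization 1: the parameters of the six neural networks of the first-layer TD3 and of the six neural networks of the second-layer TD3 in TTD3;
Initialization 2: soft update factor $\psi$, number of steps per episode $N_{\mathrm{step}}$, number of episodes Episode, experience replay buffer $B$, update interval $C$, batch size $v$;
(1) for ${\mathrm{episode}} = 1$ to Episode do
(2)   Initialize the UAV position, the user positions, and the channel state;
(3)   for ${\mathrm{step}} = 1$ to $N_{\mathrm{step}}$ do
(4)     Obtain ${\boldsymbol{h}}_{\rm{S,U}}, {\mathrm{SINR}}_{\rm{S,U}}, {\boldsymbol{h}}_{{\rm U},i} + {\boldsymbol{h}}_{{\rm R},i}{\boldsymbol{\varTheta}} h_{\mathrm{U,R}}$ as ${\boldsymbol{s}}_1$, and ${\boldsymbol{q}}$ as ${\boldsymbol{s}}_2$;
(5)     Generate the actions ${\boldsymbol{a}}_1$ and ${\boldsymbol{a}}_2$ according to Eq. (21);
(6)     Execute the actions to obtain the immediate rewards $r_1$ and $r_2$, and observe the new states ${\boldsymbol{s}}_1'$ and ${\boldsymbol{s}}_2'$;
(7)     Store the transition tuples $({\boldsymbol{s}}_1, {\boldsymbol{a}}_1, r_1, {\boldsymbol{s}}_1')$ and $({\boldsymbol{s}}_2, {\boldsymbol{a}}_2, r_2, {\boldsymbol{s}}_2')$ in $B$;
(8)     Randomly sample $v$ experiences for training;
(9)     Obtain the target value according to Eq. (27);
(10)    Update the Actor and Critic network parameters according to the delayed policy update mechanism;
(11)    Update the target Actor and target Critic network parameters using Eq. (26);
(12)  end for
(13) end for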
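Steps (7) to (11) of the procedure above can be sketched with the textbook TD3 building blocks. Eqs. (21), (26), and (27) are not reproduced in this section, so the forms below are the standard TD3 ones and are only assumed to match the paper: clipped double-Q targets, target policy smoothing, Polyak (soft) updates with factor `psi`, and an actor update every `interval` steps. The discount factor `gamma`, the noise parameters, and all function and class names are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity experience buffer B holding (s, a, r, s') tuples."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def store(self, transition):
        self.data.append(transition)

    def sample(self, batch_size, rng):
        # Uniformly sample up to batch_size (v) transitions without replacement.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


def smoothed_target_action(action, sigma, noise_clip, low, high, rng):
    """Target policy smoothing: clipped Gaussian noise, then action bounds."""
    out = []
    for a in action:
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, sigma)))
        out.append(max(low, min(high, a + noise)))
    return out


def td3_target(reward, q1_next, q2_next, gamma, done=False):
    """Clipped double-Q target: r + gamma * min(Q1', Q2') for non-terminal s'."""
    return reward + (0.0 if done else gamma * min(q1_next, q2_next))


def should_update_actor(step, interval):
    """Delayed policy update: actor and targets change every `interval` steps."""
    return step % interval == 0


def soft_update(target, online, psi):
    """Polyak averaging: target <- psi * online + (1 - psi) * target."""
    return [psi * w + (1.0 - psi) * t for t, w in zip(target, online)]


if __name__ == "__main__":
    rng = random.Random(0)
    buf = ReplayBuffer(capacity=100)
    for step in range(1, 5):
        buf.store(([0.0], [0.1 * step], 1.0, [0.0]))
    batch = buf.sample(2, rng)
    y = td3_target(reward=1.0, q1_next=2.0, q2_next=3.0, gamma=0.9)
    print(len(batch), y, should_update_actor(4, 2))
```

In a two-layer arrangement, each layer would hold its own actor, two critics, and their three target copies (six networks per layer, matching Initialization 1), with these helpers applied per layer.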