Deep Reinforcement Learning Based Beamforming Algorithm for IRS Assisted Cognitive Radio System

LI Guoquan; CHENG Tao; GUO Yongcun; PANG Yu; LIN Jinzhao

doi:10.11999/JEIT240447

Volume 47 Issue 3

Mar. 2025


LI Guoquan, CHENG Tao, GUO Yongcun, PANG Yu, LIN Jinzhao. Deep Reinforcement Learning Based Beamforming Algorithm for IRS Assisted Cognitive Radio System[J]. Journal of Electronics & Information Technology, 2025, 47(3): 657-665. doi: 10.11999/JEIT240447


doi: 10.11999/JEIT240447 cstr: 32379.14.JEIT240447

1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Chongqing Key Laboratory of Optoelectronic Information Sensing and Microsystems, Chongqing 400065, China

Funds: The National Natural Science Foundation of China (U21A20447), The Foundation for Innovative Research Groups of the Natural Science Foundation of Chongqing (cstc2020jcyj-cxttX0002)

Received Date: 2024-06-04
Revision Received Date: 2025-02-17

Available Online: 2025-02-26

Publish Date: 2025-03-01

Abstract


Objective: With the rapid development of wireless communication technologies, the demand for spectrum resources has increased significantly. Cognitive Radio (CR) improves spectrum utilization by allowing Secondary Users (SUs) to access licensed spectrum bands without causing harmful interference to Primary Users (PUs). However, traditional CR networks struggle to achieve high spectral efficiency because they have limited control over the wireless environment. Intelligent Reflecting Surfaces (IRSs) have recently been introduced to enhance communication performance by dynamically reconfiguring the propagation environment. This paper aims to maximize the sum rate of SUs in an IRS-assisted CR network by jointly optimizing the active beamforming at the Secondary Base Station (SBS) and the passive beamforming at the IRS, subject to constraints on the maximum transmit power of the SBS, the interference tolerance of PUs, and the unit modulus of the IRS phase shifts.

Methods: To address this non-convex and highly coupled optimization problem, a Deep Reinforcement Learning (DRL)-based algorithm is proposed. The problem is formulated as a Markov Decision Process (MDP). The state space comprises the Channel State Information (CSI) of the entire system and the Signal-to-Interference-plus-Noise Ratio (SINR) in the SU network; the action space consists of the SBS beamforming vectors and the IRS phase shift matrix. The reward function is designed to maximize the sum rate of SUs while penalizing constraint violations. The Deep Deterministic Policy Gradient (DDPG) algorithm is used to solve the MDP because it can handle continuous action spaces. The DDPG framework consists of an actor network, which outputs the actions, and a critic network, which evaluates these actions based on the reward function.
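As an illustration of the action space described above, the following is a minimal sketch of how a raw real-valued actor output could be mapped to a feasible action. It is not taken from the paper: the function name project_action, the stacking order of real and imaginary parts, and the symbols M (SBS antennas), K (SUs), N (IRS elements), and p_max (power budget) are assumptions made for this example. The sketch rescales the beamforming matrix only when it exceeds the power budget and normalizes each IRS coefficient to unit modulus.

```python
import numpy as np

def project_action(raw, M, K, N, p_max):
    """Map a raw actor output to a feasible (W, theta) pair.

    Assumed layout of `raw` (hypothetical, not from the paper):
    [Re(W), Im(W), Re(theta), Im(theta)], with W flattened row-wise.
    """
    raw = np.asarray(raw, dtype=float)
    n_w = M * K
    if raw.size != 2 * (n_w + N):
        raise ValueError("raw action has the wrong length")

    # Rebuild the complex M x K beamforming matrix.
    W = (raw[:n_w] + 1j * raw[n_w:2 * n_w]).reshape(M, K)

    # Transmit power constraint: scale down only if ||W||_F^2 > p_max.
    power = np.linalg.norm(W, "fro") ** 2
    if power > p_max:
        W = W * np.sqrt(p_max / power)

    # Unit-modulus constraint: keep the phase, discard the magnitude.
    theta = raw[2 * n_w:2 * n_w + N] + 1j * raw[2 * n_w + N:]
    mag = np.abs(theta)
    theta = np.where(mag > 1e-12, theta / np.maximum(mag, 1e-12), 1.0 + 0j)
    return W, theta
```

With this kind of projection, the power and unit-modulus constraints hold by construction, so only the PU interference constraint would need to be handled through the reward.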
The training process involves interacting with the environment to learn the optimal policy, and the algorithm is fine-tuned to ensure convergence and robustness under varying system conditions.

Results and Discussions: Simulation results show that, after optimization, the proposed scheme achieves a sum rate comparable to that of traditional optimization algorithms at lower time complexity. It significantly outperforms the no-IRS and random-phase-shift IRS schemes (Fig. 5). Its sum rate is close to that of alternating optimization-based approaches (Fig. 5), while its computational complexity is substantially lower (Fig. 5, Table 2). The impact of the number of IRS elements on the sum rate is also examined (Fig. 6): as expected, the average reward increases with the number of reflecting elements while the convergence time remains stable, indicating the robustness of the proposed algorithm. Starting from the identity matrix, the DRL-based algorithm learns to adjust the beamforming vectors and phase shifts toward the optimal solution through interaction with the environment (Fig. 7). The variance of the instantaneous reward increases with the transmit power, because the instantaneous reward has a larger dynamic range at higher power levels, which leads to greater fluctuations and slower convergence. The average reward versus time steps under different transmit power levels highlights the sensitivity of the algorithm to high signal-to-noise ratios (Fig. 8). A learning rate of 0.001 yields the best performance, while excessively high or low learning rates degrade it (Fig. 9). The discount factor has a smaller impact on performance than the learning rate (Fig. 10).
Conclusions: This paper proposes a DRL-based algorithm for joint active and passive beamforming optimization in an IRS-assisted CR network. The algorithm uses the DDPG framework to maximize the sum rate of SUs while satisfying the constraints on transmit power, interference, and IRS phase shifts. Simulation results demonstrate that the proposed algorithm achieves a sum rate comparable to that of traditional optimization methods with significantly lower computational complexity. The findings also highlight the impact of DRL parameter settings on performance. Future work will focus on extending the proposed algorithm to multi-cell scenarios and incorporating imperfect CSI to enhance its robustness in practical environments.
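The reward idea summarized above (sum rate of SUs, with a penalty when constraints are violated) can be sketched as follows. This is a minimal example under assumptions that do not come from the paper: single-antenna SUs and PUs, a cascaded channel of the form direct link plus IRS-reflected link, no interference from the primary transmitter at the SUs, and a fixed scalar penalty. All function and parameter names (effective_channels, reward, i_max, penalty) are hypothetical.

```python
import numpy as np

def effective_channels(H_d, H_r, G, theta):
    """Rows are per-user effective channels: H_d + H_r diag(theta) G.

    H_d: (U, M) direct SBS-to-user links, H_r: (U, N) IRS-to-user links,
    G: (N, M) SBS-to-IRS link, theta: (N,) unit-modulus IRS coefficients.
    """
    return H_d + (H_r * theta) @ G

def reward(W, theta, Hd_su, Hr_su, Hd_pu, Hr_pu, G,
           noise, p_max, i_max, penalty=1.0):
    """Sum rate of the SUs minus a fixed penalty if any constraint fails."""
    H_su = effective_channels(Hd_su, Hr_su, G, theta)  # (K, M)
    H_pu = effective_channels(Hd_pu, Hr_pu, G, theta)  # (L, M)

    # rx[k, j]: power received at SU k from the beam intended for SU j.
    rx = np.abs(H_su @ W) ** 2
    signal = np.diag(rx)
    interference = rx.sum(axis=1) - signal
    sinr = signal / (interference + noise)
    sum_rate = float(np.log2(1.0 + sinr).sum())

    # Total secondary interference at each PU, and total SBS power.
    pu_interference = (np.abs(H_pu @ W) ** 2).sum(axis=1)
    power = np.linalg.norm(W, "fro") ** 2
    violated = bool((pu_interference > i_max).any()
                    or power > p_max * (1.0 + 1e-9))

    value = sum_rate - penalty if violated else sum_rate
    return value, sinr
```

In this sketch the returned SINR vector could be fed back into the next state, matching the description that the state includes the SINR in the SU network; how the penalty is actually shaped in the paper is not stated in the abstract.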
- Intelligent Reflecting Surface (IRS)
- Cognitive Radio (CR)
- Deep Reinforcement Learning (DRL)
- Beamforming
