Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles

HE Jiang; YU Wanxin; HUANG Hao; JIANG Weiheng

doi:10.11999/JEIT240738

Volume 47 Issue 5

May 2025


HE Jiang, YU Wanxin, HUANG Hao, JIANG Weiheng. Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles[J]. Journal of Electronics & Information Technology, 2025, 47(5): 1402-1417. doi: 10.11999/JEIT240738


doi: 10.11999/JEIT240738 cstr: 32379.14.JEIT240738

1. Southwest China Institute of Electronic Technology, Chengdu 610036, China
2. School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China

Funds: The Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJQN202203101)

Received Date: 2024-08-26
Revision Received Date: 2025-02-21

Available Online: 2025-03-06

Publish Date: 2025-05-01

Abstract


Objective: Demand for Unmanned Aerial Vehicles (UAVs) in distributed sensing applications has grown significantly owing to their low cost, flexibility, mobility, and ease of deployment. In these applications, coordinating multi-UAV sensing tasks, communication strategies, and flight trajectories is a major challenge. Although preliminary studies have addressed the joint optimization of UAV communication strategies and flight trajectories, most existing work overlooks how a randomly distributed, dynamically updated task airspace model affects the optimal design of UAV communication and flight strategies. Accurate UAV energy consumption modeling is also often missing when system design goals are set. Energy consumed during flight, sensing, and data transmission is a critical issue given a UAV’s limited payload capacity and energy supply, so an accurate energy consumption model is essential for extending UAV operational time. To meet the requirements of multiple UAVs performing distributed sensing, particularly when tasks are dynamically updated and data must be transmitted to ground base stations, this paper studies the joint optimal design of UAV sensing task allocation, base station association for data backhaul, flight strategy planning, and transmit power control.

Methods: To coordinate the relationships among UAVs, base stations, and sensing tasks, a protocol framework for multi-UAV distributed task sensing is first proposed. The framework divides UAV behavior during distributed sensing into four stages: cooperation, movement, sensing, and transmission. It coordinates the UAVs’ movement to the task area, task sensing, and backhaul transmission of the sensed data. Four models are then established: a dynamically updated sensing task model, a UAV movement model, a UAV sensing behavior model, and a data backhaul transmission model.
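The four-stage protocol described above can be pictured as a simple cycle. The sketch below is illustrative only: `Stage`, `next_stage`, and `run_cycle` are hypothetical names, and the assumption that the stages repeat in a fixed order is made here for illustration rather than taken from the paper. What each stage does internally is left out.

```python
from enum import Enum


class Stage(Enum):
    """The four stages named in the protocol framework."""
    COOPERATION = 0
    MOVEMENT = 1
    SENSING = 2
    TRANSMISSION = 3


def next_stage(stage: Stage) -> Stage:
    """Advance to the following stage, wrapping around after transmission.

    The wrap-around is an assumption of this sketch.
    """
    members = list(Stage)
    return members[(members.index(stage) + 1) % len(members)]


def run_cycle(start: Stage = Stage.COOPERATION) -> list:
    """Return the names of the stages visited in one full cycle."""
    visited = []
    stage = start
    for _ in range(len(Stage)):
        visited.append(stage.name)
        stage = next_stage(stage)
    return visited
```

Calling `run_cycle()` lists the stages in the order given in the abstract, starting from cooperation.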
A revenue function combining task sensing utility and task execution cost is designed, leading to a joint optimization problem over UAV task allocation, communication base station association, and flight strategy. The objective is to maximize this revenue, a weighted combination of utility and cost, over the long term. Because the problem involves high-dimensional decision variables in both discrete and continuous form, and the objective function is non-convex in these variables, it is a typical non-convex Mixed-Integer Non-Linear Programming (MINLP) problem and is NP-hard. Centralized optimization algorithms for this formulation require a central node with high computational capacity and the collection of substantial additional information, such as channel state and UAV location data, which results in high information-exchange overhead and poor scalability. To overcome these challenges, the problem is reformulated as a Markov Game (MG), and an algorithm is designed that combines the distributed coordination of Multi-Agent (MA) systems with the ability of deep Reinforcement Learning (RL) to explore the solution space. Specifically, because of the complex coupling between the continuous and discrete action spaces in the MG problem, a novel algorithm based on Independent Learning (IL), called Multi-Agent Independent-Learning Compound-Action Actor-Critic (MA-IL-CA2C), is proposed. The core idea is as follows. First, independent learning is applied to extend single-agent RL to the MA environment. Then, deep learning is used to represent the high-dimensional action and state spaces. Finally, to handle the combined discrete and continuous action spaces, the UAV action space is decomposed into discrete and continuous components, with the Deep Q-Network (DQN) algorithm applied to the discrete component and the Deep Deterministic Policy Gradient (DDPG) algorithm to the continuous component.
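The action-space decomposition can be illustrated with a minimal sketch of how one agent might pick a compound action. This is not the authors' MA-IL-CA2C implementation. All function names and parameters are hypothetical, the plain lists `q_values` and `actor_output` stand in for the outputs of the neural networks, and the way the paper couples the two parts is not described in the abstract. The discrete part uses epsilon-greedy selection over Q-values, as is common with DQN. The continuous part adds clipped Gaussian exploration noise to a deterministic actor output, as is common with DDPG.

```python
import random


def select_discrete(q_values, epsilon, rng):
    """Epsilon-greedy choice over Q-values (DQN-style exploration)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def select_continuous(actor_output, noise_std, low, high, rng):
    """Actor output plus Gaussian noise, clipped to [low, high] (DDPG-style)."""
    return [min(high, max(low, a + rng.gauss(0.0, noise_std)))
            for a in actor_output]


def select_compound_action(q_values, actor_output, epsilon=0.1,
                           noise_std=0.05, low=0.0, high=1.0, rng=None):
    """Combine one discrete choice and one continuous vector into an action.

    Assumption of this sketch: the discrete index could encode, for example,
    a task or base station choice, and the continuous vector normalized
    flight or power settings.
    """
    rng = rng or random.Random()
    return {
        "discrete": select_discrete(q_values, epsilon, rng),
        "continuous": select_continuous(actor_output, noise_std,
                                        low, high, rng),
    }
```

With exploration switched off (`epsilon=0.0`, `noise_std=0.0`), the function returns the greedy discrete index and the clipped actor output, which makes the two branches easy to inspect separately.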
Results and Discussions: The computational complexity of action selection and training for the proposed MA-IL-CA2C algorithm is analyzed theoretically. The analysis shows that its complexity is almost equivalent to that of the two benchmark algorithms, DQN and DDPG. The performance of the proposed algorithm is also evaluated by simulation. Compared with the DQN, DDPG, and Greedy algorithms, MA-IL-CA2C achieves lower network energy consumption throughout network operation (Fig. 6), higher system revenue (Fig. 5, Fig. 8, and Fig. 9), and improved UAV flight strategies (Fig. 7).

Conclusions: This paper addresses the joint optimal design of UAV sensing task allocation, data backhaul base station association, flight strategy planning, and transmit power control for multi-UAV distributed task sensing. A new IL-based algorithm, MA-IL-CA2C, is proposed. Simulation results show that the proposed algorithm achieves higher system revenue while reducing UAV energy consumption.
Keywords:
- Unmanned Aerial Vehicle (UAV)
- Distributed sensing
- Joint optimization
- Reinforcement Learning (RL)
- Markov Game (MG)
