
2025 Vol. 47, No. 5

Cover
2025, 47(5): 1-4.
Special Topic on Low Altitude Intelligent Networking
Survey of Unified Representation Technology of Multi-dimensional Information for Low Altitude Intelligent Network
DONG Chao, CUI Can, JIA Ziye, ZHU Yian, ZHANG Lei, WU Qihui
2025, 47(5): 1215-1229. doi: 10.11999/JEIT240835
Abstract:
  Significance   The Low Altitude Intelligent Network (LAIN) has emerged as a critical productive force in recent years, particularly with the growing strategic role of the low-altitude economy in national development plans. As an integral part of smart city infrastructure and advanced air mobility systems, LAIN contributes both to economic growth and to airspace security. By integrating unmanned aerial vehicles, fifth-generation communication technologies, and artificial intelligence, LAIN enables real-time monitoring and provides services for urban traffic, agriculture, and disaster management. This integration optimizes resource allocation and enhances public safety. However, the rapid development of LAIN results in a vast array of distributed aircraft and ground equipment that generate large volumes of heterogeneous data in various formats. The absence of a unified representation standard significantly hinders the efficient utilization of data within the LAIN ecosystem, presenting substantial challenges for its widespread application in complex real-world scenarios. Therefore, the development of a unified data representation model for multi-dimensional and heterogeneous information within LAIN is essential to eliminate data heterogeneity, enhance data utilization efficiency, and promote the deep integration of the low-altitude economy with the digital economy.   Progress   Existing research has explored innovative methods and technologies for representing information and for addressing potential challenges in the LAIN. However, current solutions remain domain-specific and lack adaptability to the dynamic environment of LAIN. The absence of targeted research and standards makes it difficult to establish a unified representation for multi-source data. To bridge this gap, a heterogeneous information unified representation model is proposed for LAIN.
This paper aims to address the challenges posed by complex data and information in the LAIN environment, particularly within the context of the sixth generation of communication technologies, and to provide new approaches for data management and application in LAIN. First, the heterogeneous data types within LAIN are categorized, highlighting their key characteristics and application scenarios. A platform for LAIN data integration and fusion is then developed, incorporating multiple technologies to facilitate efficient data collection, transmission, processing, and visual display. Additionally, the challenges of achieving a unified representation of multi-dimensional and heterogeneous information within LAIN are analyzed. Finally, promising methods for data fusion and representation are discussed, including data fusion, spatiotemporal gridding data technology, multi-mode technology, and knowledge graphs. These methods aim to establish a unified knowledge representation model and achieve semantic alignment, enabling the integration of data from diverse sources. Specifically, multi-source data are preprocessed to enhance understandability and availability through multi-level fusion, integrating multi-dimensional information from various sensors and data sources within a unified framework. Spatiotemporal gridding standardizes data formats and captures spatiotemporal changes, thereby effectively processing and integrating multi-source, multi-dimensional spatial data. Furthermore, integrating multi-mode data through multi-mode technology is expected to improve decision-making accuracy, while the knowledge graph links multi-source data, constructing a knowledge network that standardizes and correlates information from various sources, formats, and semantics.   
Prospects   With the advancement of multi-dimensional data unified representation technology, the LAIN is poised to integrate with edge computing, radio knowledge description languages, large language models, and other emerging technologies to enable intelligent analysis and autonomous decision-making for low-altitude systems. Specifically, data processing can be optimized through edge computing. By positioning edge devices closer to the terminal, edge computing facilitates preprocessing and preliminary analysis at the data source. This technology enhances response speed and efficiency, providing high-quality services for the rapid acquisition and unified representation of LAIN information. Data from various sensors and systems can be structured and represented in an organized manner, facilitating data exchange between different systems, enabling readable spectrum management policies, and reducing interference incidents. Additionally, large language models can assist in constructing and refining knowledge graphs, advancing the intelligent operation and management of low-altitude aircraft. These promising technologies are expected to support further fusion and unified representation of LAIN data, laying a foundation for future research in the LAIN field.  Conclusions   This paper systematically addresses the challenges of multi-dimensional data representation in the LAIN through a combination of theoretical innovation and technological integration. The main contributions of this paper include: (1) A summary of related works in the field, with an introduction to potential heterogeneous data types, their key characteristics, and relevant application scenarios. (2) The proposal of a low-altitude information fusion and monitoring system, with an analysis of the challenges in achieving unified data representation. (3) The introduction of key technologies such as data fusion, spatiotemporal gridding data technology, multi-mode technology, and knowledge graphs. 
Additionally, edge computing technology, radio knowledge description language, and large language model technology are integrated to enhance data fusion and unified representation in LAIN. The findings of this study provide both theoretical and technical support for the development of LAIN, fostering the efficient utilization and intelligent advancement of information resources.
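As a concrete illustration of the spatiotemporal gridding idea surveyed above, the sketch below quantizes heterogeneous sensor reports onto a common discrete (x, y, t) grid so that multi-source measurements can be fused under one representation. The cell sizes, field names, and simple averaging rule are illustrative assumptions, not details from the paper.

```python
from dataclasses import dataclass

# Assumed (illustrative) grid resolution for the sketch.
GRID_METERS = 100.0   # spatial cell edge length
SLOT_SECONDS = 5.0    # temporal slot length

@dataclass(frozen=True)
class GridKey:
    x: int  # spatial cell index (east)
    y: int  # spatial cell index (north)
    t: int  # time-slot index

def to_grid(east_m: float, north_m: float, time_s: float) -> GridKey:
    """Quantize a report's position and timestamp to a grid cell."""
    return GridKey(int(east_m // GRID_METERS),
                   int(north_m // GRID_METERS),
                   int(time_s // SLOT_SECONDS))

def fuse(reports):
    """Group heterogeneous reports by grid cell and average their values."""
    cells = {}
    for east_m, north_m, time_s, value in reports:
        cells.setdefault(to_grid(east_m, north_m, time_s), []).append(value)
    return {k: sum(v) / len(v) for k, v in cells.items()}

# Two reports falling in the same cell are merged; a distant one is not.
fused = fuse([(10, 20, 1, 1.0), (90, 20, 2, 3.0), (150, 20, 1, 5.0)])
```

Once every source keys its data to the same grid, downstream fusion, change detection, and knowledge-graph linking can operate on cell indices rather than raw heterogeneous coordinates.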
A Review and Prospect of Cybersecurity Research on Air Traffic Management Systems
WANG Buhong, LUO Peng, YANG Yong, ZHAO Zhengyang, DONG Ruochen, GUAN Yongjian
2025, 47(5): 1230-1265. doi: 10.11999/JEIT240966
Abstract:
  Significance   The air traffic management system is a critical national infrastructure that impacts both aerospace security and the safety of lives and property. With the widespread adoption of information, networking, and intelligent technologies, the modern air traffic management system has evolved into a space-air-ground-sea integrated network, incorporating heterogeneous systems and multiple stakeholders. The network security of the system can no longer be effectively ensured by device redundancy, physical isolation, security by obscurity, or human-in-the-loop strategies. Due to the stringent requirements for aviation airworthiness certification, the implementation of new cybersecurity technologies is often delayed. New types of cyberattacks, such as advanced persistent threats and supply chain attacks, are increasingly prevalent. Vulnerabilities in both hardware and software, particularly in embedded systems and industrial control systems, are continually being exposed, widening the attack surface and increasing the number of potential attack vectors. Cyberattack incidents are frequent, and the network security situation remains critical.   Progress   The United States’ Next Generation Air Transportation System (NextGen), the European Commission’s Single European Sky Air Traffic Management Research (SESAR), and the Civil Aviation Administration of China have prioritized cybersecurity in their development plans for next-generation air transportation systems. Several countries and organizations, including the United States, Japan, China, the European Union, and Germany, have established frameworks for the information security of air traffic management systems. Although network and information security for air traffic management systems is gaining attention, many countries prioritize operational safety over cybersecurity concerns. Existing security specifications and industry standards are limited in addressing network and information security. 
Most of them focus on top-level design and strategic directions, with insufficient attention to fundamental theories, core technologies, and key methodologies. Current review literature lacks a comprehensive assessment of assets within air traffic management systems, often focusing only on specific components such as aircraft or airports. Furthermore, research on aviation information security mainly addresses traditional concerns, without fully considering the intelligent and dynamic security challenges facing next-generation air transportation systems.   Conclusions   This paper comprehensively examines the complexity of the cybersecurity ecosystem in air traffic management systems, considering various entities such as electronic-enabled aircraft, communication, navigation, Surveillance/Air Traffic Management (CNS/ATM), smart airports, and intelligent computing. It focuses on asset categorization, information flow, threat analysis, attack modeling, and defense mechanisms, integrating dynamic flight phases to systematically review the current state of cybersecurity in air traffic management systems. Several scientific issues are identified that must be addressed in constructing a secure ecological framework for air traffic management. Based on the Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) model, this paper analyzes typical attack examples related to the four ecological entities (Figs. 7, 9, 12, and 14) and constructs an ATT&CK matrix for air traffic management systems (Fig. 15). Additionally, with the intelligent development goal of next-generation air transportation systems as a guide, ten typical applications of intelligent air traffic management are outlined (Fig. 13, Table 11), with a systematic analysis of the attack patterns and defense mechanisms of their intelligent algorithms (Tables 12, 13). These findings provide theoretical references for the development of smart civil aviation and the assurance of cybersecurity in China.   
Prospects   Currently, the cybersecurity ecosystem of air traffic management systems is highly complex, with unclear mechanisms, indistinct boundaries for cybersecurity assets, and incomplete security assurance requirements. Moreover, there is a lack of comprehensive, systematic, and holistic cybersecurity design and defense mechanisms, which limits the ability to counter various subjective, human-driven, and emerging types of malicious cyberattacks. This paper highlights key research challenges in areas such as dynamic cybersecurity analysis, attack impact propagation modeling, human-in-the-loop cybersecurity analysis, and distributed intrusion detection systems. Cybersecurity analysis of air traffic management systems should be conducted within the dynamic operational environment of a space-air-ground-sea integrated network, accounting for the cybersecurity ecosystem and analyzing it across different spatial and temporal dimensions. As aircraft are cyber-physical systems, cybersecurity threat analysis should focus on the interrelated propagation mechanisms between security and safety, as well as their cascading failure models. Furthermore, humans serve as the last line of defense in cybersecurity. When performing threat modeling and risk assessment for avionics systems, it is crucial to fully incorporate “human-in-the-loop” characteristics to derive comprehensive and objective conclusions. Finally, the design, testing, certification, and updating of civil aviation avionics systems are constrained by strict airworthiness requirements, preventing the rapid implementation of advanced cybersecurity technologies. Distributed anomaly detection systems, however, currently represent an effective technical approach for combating cyberattacks in air traffic management systems.
A Survey on Trajectory Planning and Resource Allocation in Unmanned Aerial Vehicle-assisted Edge Computing Networks
WANG Kan, CAO Tielin, LI Xujie, LI Hongyan, LI Meng, ZHOU Momiao
2025, 47(5): 1266-1281. doi: 10.11999/JEIT241071
Abstract:
  Significance   Unmanned Aerial Vehicle-assisted Mobile Edge Computing (UAV-MEC) is recognized for its flexible deployment, rapid response, wide-area coverage, and distributed computing capabilities, demonstrating significant potential in smart cities, environmental monitoring, and emergency rescue. Traditional ground-based MEC systems, constrained by fixed edge server deployments, are inadequate for dynamic user distributions and remote area demands. The integration of UAVs with MEC is a critical advancement, where dynamic trajectory planning and resource allocation enhance network energy efficiency, computational efficiency, and service quality, supporting the development of low-altitude intelligent networking. This integration addresses the efficient offloading and real-time processing of computation-intensive tasks through air-ground collaborative optimization, providing foundational technical support for future 6G networks and low-altitude economies. Existing surveys in this field predominantly focus on the integration of UAVs and MEC from a resource allocation perspective, while trajectory planning and its joint optimization with resource allocation are largely overlooked. Furthermore, the distinctions between online and offline optimization are insufficiently addressed in existing surveys, necessitating a systematic analysis of current theories and methods to guide future research.  Progress   In UAV-MEC, joint optimization of trajectory planning and resource allocation has progressed in both online and offline domains. Algorithm frameworks, including alternating optimization and reinforcement learning, have been shown to effectively balance computational complexity with optimization performance. 
(1) Offline optimization: (a) Energy efficiency optimization: Existing studies employ alternating optimization methods, such as Block Coordinate Descent (BCD) and Successive Convex Approximation (SCA), as well as heuristic algorithms, including differential evolution and dynamic programming, to jointly optimize trajectories, task offloading, and resource allocation, minimizing energy consumption for both users and UAVs. Further reductions in system energy consumption are achieved by integrating Wireless Power Transfer (WPT) and Reconfigurable Intelligent Surfaces (RIS). (b) Latency optimization: Non-Orthogonal Multiple Access (NOMA) and task scheduling strategies are utilized to minimize user-perceived latency. A multi-UAV collaborative framework based on game theory and reinforcement learning is proposed. (c) Multi-objective optimization: The Dinkelbach method is introduced to address fractional programming problems, facilitating joint optimization of computational efficiency, throughput, and secure capacity. Digital Twin (DT) technology is integrated to approximate global optimality. (2) Online optimization: (a) Lyapunov framework: Long-term stochastic problems are decoupled into per-slot optimizations through temporal decomposition. Convex optimization is combined with dynamic trajectory and resource allocation adjustments to adapt to time-varying channels and user mobility. (b) Reinforcement learning: Multi-Agent Deep Reinforcement Learning (MADRL) is applied to multi-UAV collaboration, with expert knowledge guidance and noise injection incorporated to accelerate algorithm convergence. (3) Hybrid optimization: A “pre-planning + online adjustment” strategy is proposed. In the offline phase, clustering algorithms and particle swarm optimization are used to generate high-quality samples for training Deep Neural Networks (DNNs). 
In the online phase, incremental learning is applied to dynamically fine-tune DNNs for unknown scenarios, balancing global planning with real-time responsiveness.  Conclusions  Despite notable advancements, several critical challenges in UAV-assisted MEC remain unresolved: (1) Incomplete future state information: The formulation of offline optimization problems typically assumes full knowledge of environmental state information over future time horizons. However, in multi-UAV scenarios involving multi-dimensional parameters, acquiring complete and accurate state information across extended periods remains difficult, limiting the applicability of offline methods. (2) Real-time multi-UAV coordination: Enhancing system efficiency and task completion quality requires real-time coordination among UAVs. This process demands extensive information exchange within UAV swarms, complex obstacle avoidance, and high-dimensional control adjustments. Collaborative computation offloading and multi-UAV trajectory planning remain challenging due to their inherent complexity. (3) Security vulnerabilities in air-ground links: The line-of-sight propagation and open transmission environment of UAV-assisted MEC networks expose offloaded data to risks such as eavesdropping, data tampering, and signal interference. Current approaches predominantly rely on physical-layer security, whereas active defense mechanisms against emerging threats, including deep-fake signal attacks, are still underdeveloped. (4) Lack of integration across air-space-ground networks: The absence of standardized interfaces and unified cross-domain resource scheduling protocols hinders the coordination of spectrum, computing, and caching resources between satellites and UAV-MEC systems. This limitation restricts the realization of globally orchestrated heterogeneous networks. (5) Energy constraints and service quality tradeoff: UAV endurance directly affects service coverage and operational sustainability. 
Although energy efficiency is emphasized in both offline and online optimization strategies, a fundamental tradeoff persists between energy consumption and edge service quality. These technical bottlenecks continue to restrict the transition of UAV-MEC from theoretical frameworks to large-scale real-world deployment.   Prospects   Future research is expected to progress in the following directions: (1) Intelligent environmental perception will advance through the integration of Gated Recurrent Units (GRUs) or Temporal Convolutional Networks (TCNs) for dynamic parameter prediction, while Generative Adversarial Networks (GANs) will be leveraged to fill in incomplete environmental state information. (2) Multi-UAV collaboration and energy efficiency will be enhanced through the development of service migration mechanisms and DT-driven MADRL frameworks, combined with solar/WPT technologies to improve UAV endurance. (3) Secure communication mechanisms will evolve with the combination of beamforming, physical-layer security techniques, and homomorphic encryption-based task offloading protocols to address eavesdropping and data tampering in air-ground channels. (4) Heterogeneous network integration will focus on exploring a “cloud-edge-device-satellite” architecture to expand UAV-MEC coverage and robustness in 6G networks, with the development of satellite-assisted cross-domain resource scheduling algorithms. (5) Green computing paradigms will emerge through the integration of energy harvesting and service migration mechanisms to reduce computing loads, promoting sustainable low-altitude intelligent computing ecosystems.
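The Lyapunov-based online optimization summarized above decouples a long-term constraint into independent per-slot decisions. A minimal drift-plus-penalty sketch follows, with a toy candidate set and hypothetical names; it is an illustration of the general technique, not the survey's formulation.

```python
def lyapunov_step(queue: float, candidates, energy_budget: float, V: float):
    """One slot of drift-plus-penalty: a virtual queue tracks the
    accumulated energy deficit, and the slot's action minimizes
    V * cost + queue * energy, trading objective against backlog.

    candidates: iterable of (cost, energy) pairs available this slot.
    Returns the chosen (cost, energy) pair and the updated queue."""
    best = min(candidates, key=lambda ce: V * ce[0] + queue * ce[1])
    cost, energy = best
    # Virtual-queue update: backlog grows when the slot overspends
    # the per-slot energy budget, and drains otherwise.
    new_queue = max(queue + energy - energy_budget, 0.0)
    return best, new_queue
```

With an empty queue the cheap-but-energy-hungry action wins; once the deficit queue builds up, the energy-frugal action is preferred, which is exactly how the long-term constraint is enforced without future state information.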
Jointly Optimized Deployment and Power for Unmanned Aerial Vehicle-Satellite Assisted Cell-Free Massive MIMO Systems
ZHAO Haitao, LIU Ying, WANG Qin, LIU Miao, ZHU Hongbo
2025, 47(5): 1282-1290. doi: 10.11999/JEIT240058
Abstract:
  Objective   This study addresses persistent limitations in resource availability, cognitive adaptability, and spatial coverage in traditional Cell-Free massive Multiple-Input Multiple-Output (CF-mMIMO) systems. A novel framework is proposed that integrates power control and Unmanned Aerial Vehicle (UAV) placement within a Low Earth Orbit (LEO) satellite-assisted downlink architecture. The objective is to enhance communication efficiency and system robustness in coverage-constrained wireless environments, particularly under dynamic user distributions and challenging propagation conditions.  Methods   The proposed framework adopts a hybrid optimization model that jointly considers user association, power allocation, and UAV deployment, based on the known spatial distribution of ground users and access points. With LEO satellite support, the architecture extends coverage and strengthens transmission links. The optimization problem aims to maximize the minimum achievable user data rate, subject to constraints on coverage, power, and cross-layer interference. Owing to the nonconvex and coupled nature of the variables, an iterative algorithm is developed using block coordinate descent and successive convex approximation. The original problem is decomposed into three interdependent subproblems—user association, power allocation, and UAV positioning—which are solved alternately to obtain a near-optimal solution.   Results and Discussions   Simulation results confirm that the proposed framework significantly improves system-wide throughput, communication robustness, and spectral efficiency. Compared with conventional CF-mMIMO systems, the integration of UAVs and LEO satellites enhances adaptability to non-uniform user distributions and challenging wireless environments. The strategy enables real-time adjustment of UAV positions and transmission power, improving load balancing, reducing interference, and expanding service coverage. 
Performance metrics, including the minimum user rate and total system capacity, demonstrate the proposed method’s effectiveness in complex, heterogeneous network settings.  Conclusions   This study proposes a scalable and adaptive approach for next-generation communication networks by integrating aerial and satellite components into terrestrial CF-mMIMO systems. The combination of intelligent UAV deployment and adaptive power control enables efficient resource management while maintaining high reliability and wide-area coverage. The proposed strategy represents a promising direction for future air-space-ground integrated networks, supporting high-throughput, energy-efficient, and resilient wireless services in both urban and remote scenarios.
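The block-coordinate-descent structure underlying the iterative algorithm described above can be sketched on a toy problem: each block of variables is optimized in closed form while the other is held fixed, and the two steps alternate until convergence. The quadratic objective below is an illustrative assumption, not the paper's max-min rate problem.

```python
def bcd(best_x, best_y, x, y, iters=50):
    """Block coordinate descent: alternately minimize over x and y,
    each time holding the other block fixed."""
    for _ in range(iters):
        x = best_x(y)  # optimal x given fixed y
        y = best_y(x)  # optimal y given fixed x
    return x, y

# Toy objective: (x - y)^2 + (x - 1)^2 + (y - 2)^2.
# Setting d/dx = 0 gives x = (y + 1) / 2; d/dy = 0 gives y = (x + 2) / 2.
x, y = bcd(lambda y: (y + 1) / 2, lambda x: (x + 2) / 2, 0.0, 0.0)
```

In the paper's setting, each "block" (user association, power allocation, UAV positions) is itself a nonconvex subproblem handled via successive convex approximation rather than a closed form, but the alternation pattern is the same.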
Collaborative Multi-agent Trajectory Optimization for Unmanned Aerial Vehicles Under Low-altitude Mixed-obstacle Airspace
FENG Simeng, ZHANG Yunyi, LIU Kai, LI Baolong, DONG Chao, ZHANG Lei, WU Qihui
2025, 47(5): 1291-1300. doi: 10.11999/JEIT250012
Abstract:
  Objective  The rapid expansion of the low-altitude economy has driven the development of low-altitude intelligent networks as a key component of the Internet of Things (IoT). In such networks, the growing number of users challenges the ability of Unmanned Aerial Vehicles (UAVs) with mobile base stations to sustain data transmission quality. Efficient access technologies are therefore essential to ensure service quality as user density increases. At the same time, the growing complexity of airspace elevates the risk of in-flight collisions, necessitating integrated strategies to improve both communication efficiency and flight safety. This study proposes a collaborative trajectory planning framework for multiple UAVs operating in low-altitude, mixed-obstacle environments. The approach incorporates Non-Orthogonal Multiple Access (NOMA) to increase spectral efficiency and communication capacity, together with a discrete collision probability map for obstacle avoidance. A novel multi-UAV communication and obstacle-avoidance model is developed, and an optimized Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is introduced to schedule users and plan UAV trajectories. The objective is to maximize communication energy efficiency while ensuring reliable obstacle avoidance. The proposed method effectively enhances multi-UAV coordination in complex airspace and improves the overall communication performance.  Methods  To ensure energy efficiency and reliable obstacle avoidance for multiple UAVs operating in low-altitude, mixed-obstacle environments, a multi-user communication system model is proposed, incorporating collaborative multi-UAV trajectory planning. This model comprises two key components. First, a collision probability model based on discrete obstacles extends the conventional low-altitude obstacle representation into a probabilistic collision map. 
Second, a multi-user communication framework is constructed using fractional-order transmission energy allocation under NOMA, integrating both UAV communication and flight energy models within a unified UAV energy efficiency framework. Based on this model, the problem of maximizing energy efficiency is formulated, accounting for coordinated UAV communication and obstacle avoidance. To solve this problem, an integrated strategy is proposed. A multi-agent direction-preprocessing K-means++ algorithm is first used to enhance convergence during user scheduling optimization. Based on the optimized user allocation and environmental awareness, a state space is defined together with a 3D action space consisting of 27 directional movement options. The MADDPG algorithm is then trained by alternately updating Actor and Critic networks over the defined state-action space. Once trained, the network outputs trajectory planning policies that achieve both effective obstacle avoidance and optimized communication energy efficiency.  Results and Discussions  The proposed trajectory planning framework applies a user scheduling algorithm that dynamically allocates users at each time step, incorporating the positions of other UAVs, obstacles, and associated collision probabilities as environmental inputs. The MADDPG network is trained using a reward function defined by energy efficiency and collision probability, enabling the generation of trajectory planning solutions that maintain both communication performance and flight safety for multiple UAVs. Simulation results show that the planned trajectories—depicted by red, yellow, and blue lines—are shorter on average than those obtained using the traditional safety radius method (Fig. 3). Compared with trajectory planning approaches based on varying safety radius values, the proposed method achieves an approximately 8-fold reduction in average collision probability (Fig. 5). 
In terms of communication performance, the NOMA-based approach significantly outperforms Frequency-Division Multiple Access (FDMA). Furthermore, the proposed algorithm, incorporating multi-agent direction preprocessing optimization, yields an average improvement of 10.81% in communication energy efficiency over the non-optimized variant, as evaluated by the mean across multiple iterations (Fig. 6). The network also demonstrates rapid environmental adaptation within 20 training iterations and exhibits superior generalization compared to conventional reward-based reinforcement learning algorithms (Fig. 4).  Conclusions  This paper presents a multi-UAV collaborative communication and trajectory planning solution for ensuring both flight safety and communication performance in low-altitude mixed-obstacle airspace during multi-user operations. A UAV collaborative NOMA communication system model, based on a collision probability map, is developed. An optimized MADDPG algorithm for user scheduling is introduced to address the multi-UAV trajectory planning problem, aiming to maximize communication energy efficiency. The algorithm comprises two key components: first, a user scheduling algorithm based on K-means++ that establishes user-UAV connection relationships; second, the MADDPG algorithm, which generates UAV trajectory planning solutions under dynamic environmental conditions and the established connection relationships. Simulation results reveal the following key findings: (1) The optimized MADDPG algorithm enhances multi-UAV communication while ensuring flight safety; (2) The proposed algorithm significantly improves obstacle avoidance performance, reducing collision probability approximately 8-fold compared with traditional methods; (3) The inclusion of multi-agent direction preprocessing improves communication energy efficiency by 10.81%. However, this study only considers a low-altitude environment with mixed static obstacles.
In real-world scenarios, obstacles may move or intrude dynamically, and future work should explore the impact of dynamic obstacles on trajectory planning.
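A hedged sketch of the K-means++-style user scheduling step described in this abstract: users are clustered by position and each user associates with its nearest center. Only standard K-means++ seeding and nearest-center assignment are shown; the paper's direction-preprocessing refinement and NOMA power allocation are omitted, and the user layout is an assumption for illustration.

```python
import random

def kmeanspp_seeds(users, k, rng):
    """Pick k initial UAV service centers with K-means++ seeding:
    each new seed is drawn with probability proportional to the
    squared distance from the nearest already-chosen seed."""
    seeds = [rng.choice(users)]
    while len(seeds) < k:
        d2 = [min((u[0] - s[0]) ** 2 + (u[1] - s[1]) ** 2 for s in seeds)
              for u in users]
        r, acc = rng.random() * sum(d2), 0.0
        for u, w in zip(users, d2):
            acc += w
            if acc >= r:
                seeds.append(u)
                break
    return seeds

def assign(users, seeds):
    """Associate each user with the index of its nearest center."""
    return [min(range(len(seeds)),
                key=lambda i: (u[0] - seeds[i][0]) ** 2
                            + (u[1] - seeds[i][1]) ** 2)
            for u in users]

# Example: 4 users in two spatial groups, 2 UAVs (assumed layout).
rng = random.Random(0)
users = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
labels = assign(users, kmeanspp_seeds(users, 2, rng))
```

The resulting user-UAV connection relationships would then feed the MADDPG trajectory planner as part of its environmental state.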
A Decision-making Method for UAV Conflict Detection and Avoidance System
TANG Xinmin, LI Shuai, GU Junwei, GUAN Xiangmin
2025, 47(5): 1301-1309. doi: 10.11999/JEIT240503
Abstract:
  Objective   With the rapid increase in UAV numbers and the growing complexity of airspace environments, Detect-and-Avoid (DAA) technology has become essential for ensuring airspace safety. However, the existing Detection and Avoidance Alerting Logic for Unmanned Aircraft Systems (DAIDALUS) algorithm, while capable of providing basic avoidance strategies, has limitations in handling multi-aircraft conflicts and adapting to dynamic, complex environments. To address these challenges, integrating the DAIDALUS output strategies into the action space of a Markov Decision Process (MDP) model has emerged as a promising approach. By incorporating an MDP framework and designing effective reward functions, it is possible to enhance the efficiency and cost-effectiveness of avoidance strategies while maintaining airspace safety, thereby better meeting the needs of complex airspaces. This research offers an intelligent solution for UAV avoidance in multi-aircraft cooperative environments and provides theoretical support for the coordinated management of shared airspace between UAVs and manned aircraft.   Methods   The guidance logic of the DAIDALUS algorithm dynamically calculates the UAV’s collision avoidance strategy based on the current state space. These strategies are then used as the action space in an MDP model to achieve autonomous collision avoidance in complex flight environments. The state space in the MDP model includes parameters such as the UAV's position, speed, and heading angle, along with dynamic factors like the relative position and speed of other aircraft or potential threats. The reward function is crucial for ensuring the UAV balances flight efficiency and safety during collision avoidance. It accounts for factors such as success rewards, collision penalties, proximity to target point rewards, and distance penalties to optimize decision-making. 
Additionally, the discount factor determines the weight of future rewards, balancing the importance of immediate versus future rewards. A lower discount factor typically emphasizes immediate rewards, leading to faster avoidance actions, whereas a higher discount factor places greater weight on long-term flight safety and resource consumption.  Results and Discussions   The DAIDALUS algorithm calculates the UAV’s collision avoidance strategy based on the current state space, which then serves as the action space in the MDP model. By defining an appropriate reward function and state transition probabilities, the MDP model is established to explore the impact of different discount factors on collision avoidance. Simulation results show that the optimal flight strategy, calculated through value iteration, is represented by the red trajectory (Fig. 7). The UAV completes its flight in 203 steps, while the comparative experiment trajectory (Fig. 8) consists of 279 steps, demonstrating a 27.2% improvement in efficiency. When the discount factor is set to 0.99 (Fig. 9, Fig. 10), the UAV selects a path that balances immediate and long-term safety, effectively avoiding potential collision risks. The airspace intrusion rate is 5.8% (Fig. 11, Fig. 12), with the closest distance between the threat aircraft and the UAV being 343 meters, which meets the safety requirements for UAV operations.  Conclusions   This paper addresses the challenge of UAV collision avoidance in complex environments by integrating the DAIDALUS algorithm with a Markov Decision Process model. The proposed decision-making method enhances the DAIDALUS algorithm by using its guidance strategies as the action space in the MDP.
The method is evaluated through multi-aircraft conflict simulations, and the results show that: (1) The proposed method improves efficiency by 27.2% over the DAIDALUS algorithm; (2) A discount factor of 0.99, selected based on the relationship between the discount factor and the reward value at each time step, balances long-term and short-term rewards; (3) In multi-aircraft conflict scenarios, the UAV effectively handles various conflicts and maintains a safe distance from threat aircraft, with an airspace intrusion rate of only 5.8%. However, this study only considers ideal perception capabilities, and real-world flight conditions, including sensor noise and environmental variability, should be accounted for in future work.
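The role of the discount factor in the value-iteration step described above can be illustrated on a toy MDP. The following sketch uses illustrative transition and reward tables, not the paper's actual state space: a low discount factor makes the agent favor immediate rewards, while a high one (as in the paper's choice of 0.99) favors delayed outcomes.

```python
import numpy as np

def value_iteration(P, R, gamma, tol=1e-8):
    """Generic value iteration.

    P: (A, S, S) transition probabilities; R: (A, S) immediate rewards;
    gamma: discount factor weighting future against immediate rewards.
    Returns optimal state values and the greedy policy.
    """
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)        # Bellman backup for every (a, s)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Toy 3-state chain: "stay" collects a small per-step reward; "advance"
# forgoes it to reach an absorbing goal state that pays 1.0 per step.
P = np.array([
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],   # action 0: stay
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],   # action 1: advance
], dtype=float)
R = np.array([
    [0.2, 0.2, 1.0],                     # stay: small immediate reward
    [0.0, 0.0, 1.0],                     # advance: payoff deferred to goal
])

# A low discount factor favors the immediate reward (the agent stays);
# a high one favors the delayed goal reward (the agent advances).
_, policy_myopic = value_iteration(P, R, gamma=0.1)
_, policy_farsighted = value_iteration(P, R, gamma=0.99)
```

The same mechanism, applied over the DAIDALUS-derived action space, is what makes the 0.99 setting trade short-term maneuvering cost for long-term safety.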
A Localization Algorithm for Multiple Radiation Sources in Low-altitude Intelligent Networks Based on Sparse Tensor Completion and Density Peaks Clustering
CHEN Zhibo, GUO Daoxing
2025, 47(5): 1310-1321. doi: 10.11999/JEIT241050
Abstract:
  Objective   This paper addresses key technologies for multi-source localization in low-altitude intelligent networks, aiming to achieve precise spatial localization of multiple unknown radiation sources in dynamic low-altitude environments. The localization is based on signal strength data collected by spectrum monitoring devices mounted on Unmanned Aerial Vehicles (UAVs). Traditional localization methods encounter three major challenges in practical applications: significant spatial sparsity of measurement data due to the constrained flight trajectories of UAVs, signal strength fluctuations caused by environmental noise and shadow fading, and exponential increases in algorithm complexity as the number of unknown radiation sources grows. These factors lead to a substantial decline in localization performance in dynamic low-altitude scenarios, highlighting the need for a more robust multi-source localization framework.  Methods   To address these issues, this study proposes a collaborative localization algorithm that integrates sparse tensor completion with an improved Density Peak Clustering (DPC) method. The proposed approach decomposes multi-source localization into two progressive stages: three-dimensional tensor reconstruction and density peak detection. First, the sparse measurement data from UAVs are modeled as a three-dimensional sparse tensor containing spatial coordinates and signal strength, fully characterizing the spatial distribution of signals in the target area. A tensor completion network based on convolutional autoencoders is then designed to intelligently infer the signal strength in unmeasured regions through deep feature learning, effectively alleviating the data sparsity issue. Based on the reconstructed complete signal distribution, an improved DPC algorithm is introduced. 
By incorporating an adaptive truncation distance to optimize local density calculations and constructing a decision graph using Mahalanobis distance, the algorithm accurately identifies density peaks (i.e., radiation source locations) and suppresses outliers.   Results and Discussions   The innovation of this method is reflected in the following three aspects: (1) Enhanced noise robustness: the signal spatial distribution is reconstructed through tensor completion, and pseudo-peaks caused by noise interference are eliminated using DPC clustering. Under noise power conditions of –20 dBm, the algorithm achieved a missed detection probability of 16.62% and a false alarm probability of 11.13%, while maintaining an average localization error of 12.15 m (Fig. 11, Fig. 12); (2) Improved weak-signal detection capability: local density features are used instead of traditional signal-strength threshold detection, improving localization performance for low-power radiation sources. Under conditions with radiation source transmission power of 5 dBm to 10 dBm and at a 30% sampling rate, the algorithm achieved a missed detection probability of 3.12% and a false alarm probability of 3.56%, significantly outperforming two baseline algorithms (Fig. 9, Fig. 10); (3) Optimized multi-source resolution performance: simulation experiments demonstrated that in scenarios with 10 coexisting radiation sources, the method achieved an average localization error of 6.42 m, representing a 46.94% improvement over the existing best method’s performance of 12.10 m. Additionally, the fluctuation in localization error across scenarios with 2 to 10 radiation sources was maintained within ±9% (Fig. 7, Fig. 8).  
Conclusions   This study constructs a two-stage localization framework, “tensor completion-density clustering,” which combines radio map estimation with the improved DPC algorithm for the first time, addressing the challenges of sparse measurement, noise interference, and multi-source coupling in low-altitude scenarios. The proposed algorithm can reconstruct the three-dimensional signal strength distribution from sparse measurement data obtained by UAVs and accurately localize multiple unknown radiation sources. It maintains strong performance under complex conditions, such as sparse measurements, environmental noise, and multi-source scenarios. This method provides a practical and robust solution for UAV spectrum monitoring applications. The technology offers theoretical support for tasks such as the rapid traceability of interference sources in emergency communications and collaborative spectrum sensing in UAV swarms, with significant application potential in areas such as smart city aerial monitoring and battlefield electromagnetic situational awareness.
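The density-peak detection stage can be sketched as follows. This is a generic Rodriguez–Laio-style computation with a simple percentile heuristic for the truncation distance; the paper's adaptive truncation rule and Mahalanobis-distance decision graph are not reproduced here.

```python
import numpy as np

def density_peaks(points, dc_percentile=5.0):
    """Minimal density-peaks computation: local density rho and
    separation delta for each point; density peaks are points where
    both are large.  The truncation distance d_c is set from a
    percentile of pairwise distances -- a common heuristic, not
    necessarily the paper's adaptive rule."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    dc = np.percentile(d[np.triu_indices(len(points), k=1)], dc_percentile)
    rho = (d < dc).sum(axis=1) - 1           # neighbours within d_c
    delta = np.empty(len(points))
    order = np.argsort(-rho)                 # descending density
    delta[order[0]] = d[order[0]].max()      # densest point: farthest point
    for rank, i in enumerate(order[1:], start=1):
        delta[i] = d[i, order[:rank]].min()  # nearest denser point
    return rho, delta

# Two tight clusters plus one isolated outlier: the two cluster centres
# should stand out in the decision graph (large rho * delta), while the
# outlier has rho = 0 and is suppressed.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([0, 0], 0.1, (30, 2)),
                 rng.normal([5, 5], 0.1, (30, 2)),
                 [[10.0, 0.0]]])
rho, delta = density_peaks(pts)
peaks = np.argsort(-(rho * delta))[:2]       # top-2 decision-graph scores
```

In the full algorithm, the same peak-selection logic runs on the tensor-completed signal-strength map, so each surviving peak corresponds to an estimated radiation source position.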
Research on Security, Privacy, and Energy Efficiency in Unmanned Aerial Vehicle-Assisted Federated Edge Learning Communication Systems
LU Weidang, FENG Kai, DING Yu, LI Bo, ZHAO Nan
2025, 47(5): 1322-1331. doi: 10.11999/JEIT240847
Abstract:
  Objective  Unmanned Aerial Vehicle-Assisted Federated Edge Learning (UAV-Assisted FEL) communication addresses the data isolation problem and mitigates data leakage risks in terminal devices. However, eavesdroppers may exploit model updates in FEL to recover original private data, significantly threatening the system’s privacy and security.  Methods  To address this issue, this study proposes a secure aggregation and resource optimization scheme for UAV-Assisted FEL communication systems. Terminal devices train local models using local data and transmit the updated parameters to the UAV. The UAV aggregates these parameters to generate new global model parameters. Eavesdroppers attempt to intercept the transmitted parameters to reconstruct the original data. To enhance security-privacy energy efficiency, the transmission bandwidth, CPU frequency, and transmit power of terminal devices, along with the CPU frequency of the UAV, are jointly optimized. An evolutionary Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to solve this optimization problem. The algorithm intelligently interacts with the system to achieve secure aggregation and resource optimization while meeting latency and energy consumption requirements.  Results and Discussions  The simulation results validate the effectiveness of the proposed scheme. The experiments evaluate the effects of the scheme on key performance metrics, including system cost, secure transmission rate, and secure privacy energy efficiency, from multiple perspectives. As shown in (Fig. 2), with an increasing number of terminal devices, system cost, secure transmission rate, and secure privacy energy efficiency all increase. These results indicate that the proposed scheme ensures system security and enhances energy efficiency, even in multi-device scenarios. As shown in (Fig. 
3), under varying global iteration counts, the system balances latency and energy consumption by either extending the duration to lower energy consumption or increasing energy consumption to reduce latency. The secure transmission rate rises with the number of global iterations, as fewer iterations allow the system to tolerate higher energy consumption and latency per iteration, leading to reduced transmission power from terminal devices to meet system constraints. Additionally, secure privacy energy efficiency improves with increasing global iterations, further demonstrating the scheme’s capacity to ensure system security and reduce system cost as global iterations increase. As shown in (Fig. 4), during UAV flight, secure privacy energy efficiency fluctuates, with higher secure transmission rates observed when the communication environment between terminal devices and the UAV is more favorable. As shown in (Fig. 5), the proposed scheme is compared with two baseline schemes: Scheme 1, which minimizes system latency, and Scheme 2, which minimizes system energy consumption. The proposed scheme significantly outperforms both baselines in cost overhead. Scheme 1 achieves a slightly higher secure transmission rate than the proposed scheme due to its focus on minimizing latency at the expense of higher energy consumption. Conversely, Scheme 2 shows a considerably lower secure transmission rate as it prioritizes minimizing energy consumption, resulting in lower transmission power and compromised secure transmission rates. The results indicate that the secure privacy energy efficiency of the proposed scheme significantly exceeds that of the baseline schemes, further demonstrating its effectiveness.  Conclusions  To enhance data transmission security and reduce system costs, this paper proposes a secure aggregation and resource optimization scheme for UAV-Assisted FEL. 
Under constraints of limited computational and communication resources, the scheme jointly optimizes the transmission bandwidth, CPU frequency, and transmission power of terminal devices, along with the CPU frequency of the UAV, to maximize the secure privacy energy efficiency of the UAV-Assisted FEL system. Given the complexity of the time-varying system and the strong coupling of multiple optimization variables, an advanced DDPG algorithm is developed to solve the optimization problem. The problem is first modeled as a Markov Decision Process, followed by the construction of a reward function positively correlated with the secure privacy energy efficiency objective. The proposed DDPG network then intelligently generates joint optimization variables to obtain the optimal solution for secure privacy energy efficiency. Simulation experiments evaluate the effects of the proposed scheme on key system performance metrics from multiple perspectives. The results demonstrate that the proposed scheme significantly outperforms other benchmark schemes in improving secure privacy energy efficiency, thereby validating its effectiveness.
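The trade-off the DDPG agent searches over can be illustrated with a toy secure-privacy energy-efficiency metric. The formulas below are standard wiretap-channel expressions used only for illustration; the paper's exact objective, constraints, and variable coupling differ, and all parameter values are hypothetical.

```python
import math

def secrecy_rate(snr_user, snr_eve, bandwidth):
    """Classic wiretap secrecy rate (bit/s): capacity to the legitimate
    receiver minus capacity to the eavesdropper, floored at zero."""
    return bandwidth * max(0.0, math.log2(1 + snr_user) - math.log2(1 + snr_eve))

def secure_privacy_energy_efficiency(snr_user, snr_eve, bandwidth,
                                     tx_power, cpu_energy):
    """Illustrative metric: secure bits delivered per joule spent on
    transmission plus local computation, over a unit-length slot."""
    rate = secrecy_rate(snr_user, snr_eve, bandwidth)
    energy = tx_power * 1.0 + cpu_energy   # assume a 1 s upload slot
    return rate / energy

# Raising transmit power raises both SNRs (and hence the secrecy rate)
# but also the energy bill, so efficiency peaks at an interior power
# level -- the kind of trade-off the DDPG agent must resolve jointly
# with bandwidth and CPU-frequency allocation.
effs = [secure_privacy_energy_efficiency(snr_user=10 * p, snr_eve=2 * p,
                                         bandwidth=1e6, tx_power=p,
                                         cpu_energy=0.5)
        for p in (0.1, 0.5, 1.0, 2.0, 4.0)]
```

Because the real problem couples several such variables over a time-varying channel, a closed-form optimum is impractical, which motivates the learning-based solver.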
Multi-Model Fusion-Based Abnormal Trajectory Correction Method for Unmanned Aerial Vehicles
WANG Wei, SHE Dingchen, WANG Jiaqi, HAN Dairu, JIN Benzhou
2025, 47(5): 1332-1344. doi: 10.11999/JEIT241026
Abstract:
  Objective  The opening of low-altitude airspace and the widespread deployment of Unmanned Aerial Vehicles (UAVs) have significantly increased low-altitude flight activities. Trajectory planning is essential for ensuring UAVs operate safely in complex environments. However, wireless remote control links are vulnerable to interference and spoofing attacks, leading to deviations from planned trajectories and posing serious safety risks. To mitigate these risks, UAV position parameters can be predicted and used to replace erroneous navigation system values, thereby correcting abnormal trajectories. Existing prediction-based correction methods, however, exhibit low efficiency and error accumulation over long-term predictions, limiting their practical application. To address these limitations, this study proposes a multi-model fusion method to improve the efficiency and accuracy of abnormal trajectory correction, providing a robust solution for real-world UAV operations.  Methods  A Long Short-Term Memory (LSTM)-Transformer prediction model, integrating LSTM and Transformer, is proposed to exploit the strengths of both architectures in time series forecasting. LSTM efficiently captures short-term dependencies in sequential data, whereas Transformer is well-suited for modeling long-term dependencies. By combining these architectures, the proposed model enhances the capture of both short-term and long-term dependencies, reducing prediction errors. The overall framework of the LSTM-Transformer prediction model is illustrated in (Fig. 3). The input time series data undergoes preprocessing before being fed into the LSTM and Transformer sub-models, each generating a corresponding feature vector. These feature vectors are concatenated and further processed by a fully connected layer to extract intrinsic data features, ultimately producing the prediction results. To further optimize the model, a blockwise attention strategy is proposed. 
The detailed computation process is shown in (Fig. 4). During self-attention calculations in the Transformer sub-model, the input sequence is divided into multiple sub-blocks, allowing for parallel computation. The results are then concatenated to obtain the final output. This approach effectively reduces the computational complexity of the Transformer sub-model while improving the efficiency of abnormal trajectory correction. The blockwise attention strategy not only enhances computational efficiency but also maintains prediction accuracy, making it a crucial component of the proposed method.  Results and Discussions  Experiments are conducted using a public dataset to predict UAV positional parameters, including longitude, latitude, and altitude. The dataset’s feature parameters are presented in (Table 1). The trajectory correction performance of the proposed method is evaluated and compared with other correction methods using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). (Fig. 5) and (Fig. 6) present the error metrics of the proposed method in comparison with Support Vector Regression (SVR), CNN-LSTM, and LSTM-RF under different prediction step sizes and measurement noise standard deviation conditions. The results indicate that the proposed method achieves the lowest correction errors. At a prediction step size of 20 and a measurement noise standard deviation of 0.19, the proposed method achieves RMSE, MAE, and MAPE values of 0.2971, 0.2208, and 21.688%, respectively. Compared with SVR, CNN-LSTM, and LSTM-RF, the RMSE is reduced by 39.52%, 6.22%, and 20.65%, the MAE by 45.5%, 8.46%, and 20.52%, and the MAPE by 8.955%, 2.03%, and 3.532%, respectively. (Fig. 7) and (Fig. 
8) compare the proposed method with the original LSTM-Transformer, the Transformer with the blockwise attention optimization strategy, and individual LSTM and Transformer models in terms of error metrics under different prediction steps and measurement noise standard deviation conditions. When the prediction step is 20 and the measurement noise standard deviation is 0.19, the proposed method achieves RMSE reductions of 12.23%, 4.07%, 1.36%, and 3.48%, MAE reductions of 19.36%, 6.76%, 3.83%, and 4.21%, and MAPE reductions of 3.84%, 3.616%, 2.075%, and 2.087%, compared to the other four correction methods. These findings demonstrate the superior performance of the proposed method in reducing trajectory correction errors. The runtime efficiency of the proposed method under different prediction steps is evaluated, as shown in (Fig. 9). With a prediction step size of 20, the proposed method completes the prediction in 0.699 s, which is 35.87% faster than the original LSTM-Transformer model. This confirms that the blockwise attention optimization strategy enhances correction efficiency. Finally, (Fig. 10) presents trajectory comparisons, illustrating the accuracy of the proposed method. The predicted trajectories closely align with actual trajectories, outperforming baseline methods in correcting UAV abnormal trajectories under various conditions.  Conclusions  The proposed multi-model fusion method for UAV abnormal trajectory correction enhances correction efficiency and reduces errors more effectively than benchmark methods. The results demonstrate that the method achieves accurate and reliable trajectory correction, making it suitable for practical UAV applications.
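The blockwise attention strategy can be sketched as follows: a minimal NumPy version in which self-attention is restricted to non-overlapping sub-blocks that can be computed in parallel, reducing cost from O(L²) to O(L·B). This is only an assumed form of the strategy, not the authors' exact scheme.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Standard scaled dot-product self-attention."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def blockwise_attention(q, k, v, block_size):
    """Split the sequence into sub-blocks, attend within each block,
    then concatenate the results.  Each block's (B x B) score matrix is
    independent, so the blocks can be computed in parallel; the price
    is that cross-block interactions are ignored."""
    L = q.shape[0]
    out = [attention(q[s:s + block_size],
                     k[s:s + block_size],
                     v[s:s + block_size])
           for s in range(0, L, block_size)]
    return np.vstack(out)

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 8))            # 20 time steps, model dim 8
y = blockwise_attention(x, x, x, block_size=5)
```

The output keeps the full sequence shape, so the optimized Transformer sub-model drops in without changing the downstream concatenation and fully connected layers.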
Differentially Private Federated Learning Based Wideband Spectrum Sensing for the Low-Altitude Unmanned Aerial Vehicle Swarm
DONG Peihao, JIA Jibin, ZHOU Fuhui, WU Qihui
2025, 47(5): 1345-1355. doi: 10.11999/JEIT241042
Abstract:
  Objective  Wideband Spectrum Sensing (WSS) for Unmanned Aerial Vehicles (UAVs) in low-altitude intelligent networks is essential for efficient spectrum monitoring and utilization. However, sampling at the Nyquist rate incurs high hardware and computational costs. Moreover, the high mobility of UAVs subjects them to rapidly changing spectral environments, which significantly reduces sensing accuracy and presents major challenges for UAV-based WSS.  Methods  A low-complexity Feature-Splitting Wideband Spectrum Sensing neural Network (FS-WSSNet) is proposed to achieve high sensing accuracy while reducing the operational cost of UAVs through sub-Nyquist sampling. To integrate spectral knowledge and computational resources across multiple UAVs and enable adaptation to varying spectrum environments, an online model adaptation algorithm based on Differential Privacy Federated Transfer Learning (DPFTL) is further proposed. Before model parameters are uploaded to a central computation platform, noise is added according to local differential privacy constraints. This enables spectrum knowledge sharing while preserving data privacy within the UAV swarm, allowing FS-WSSNet on each UAV to rapidly adapt to dynamic spectral conditions.  Results and Discussions  Simulation results demonstrate the effectiveness of the proposed FS-WSSNet and the DPFTL-based online model adaptation algorithm. FS-WSSNet achieves substantially higher prediction accuracy than the comparison models, confirming that omitting convolutional layers degrades performance and supporting the design rationale of FS-WSSNet (Fig. 3). In addition, FS-WSSNet consistently outperforms the baseline scheme across all Signal-to-Noise Ratio (SNR) conditions (Fig. 4). Its Receiver Operating Characteristic (ROC) curve, which lies closer to the top-left corner, indicates improved detection performance across various thresholds (Fig. 5). 
FS-WSSNet also exhibits significantly lower computational complexity compared with the baseline (Table 1). Furthermore, under the proposed DPFTL-based scheme (Algorithm 1), FS-WSSNet maintains robust performance across different target scenarios without requiring local adaptation samples. This approach not only preserves data privacy but also improves the model’s generalization ability (Figs. 6–9).  Conclusions  This study proposes a cooperative WSS scheme based on DPFTL for low-altitude UAV swarms. First, data received by UAVs are processed using multicoset sampling to enable cost-efficient sub-Nyquist acquisition. The resulting signals are input into a low-complexity FS-WSSNet for accurate and efficient spectrum detection. An online model adaptation algorithm based on DPFTL is then developed, introducing noise to model parameters before upload to ensure data privacy. By supporting spectrum knowledge sharing and collaborative training, the algorithm effectively integrates the computational and data resources of multiple UAVs to construct a robust model adaptable to various scenarios. Simulation results confirm that the proposed scheme provides an efficient WSS solution for resource-constrained low-altitude UAV networks, achieving both privacy protection and adaptability across scenarios.
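The noise-before-upload step of the DPFTL scheme can be sketched with a standard clip-and-perturb mechanism: bound the update's norm so its sensitivity is known, then add Gaussian noise before it leaves the UAV. This is a generic construction; the paper's calibration of noise to the local differential-privacy budget may differ.

```python
import numpy as np

def privatize_update(params, clip_norm, noise_std, rng):
    """Clip a local model update to bound its sensitivity, then add
    Gaussian noise to every parameter before upload.  The central
    platform only ever sees the noisy, clipped update."""
    flat = np.concatenate([p.ravel() for p in params])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))
    return [p * scale + rng.normal(0.0, noise_std, size=p.shape)
            for p in params]

rng = np.random.default_rng(0)
update = [np.ones((4, 4)) * 3.0, np.ones(4) * 3.0]   # toy local update
private = privatize_update(update, clip_norm=1.0, noise_std=0.01, rng=rng)
# Before noise is added, the clipped update's global norm is at most
# clip_norm, so the added noise dominates any single sample's influence.
```

Because each UAV perturbs its parameters locally, aggregation at the central platform never exposes raw spectrum data, which is the privacy property the scheme relies on.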
Multi-Hop UAV Ad Hoc Network Access Control Protocol: Deep Reinforcement Learning-Based Time Slot Allocation Method
SONG Liubin, GUO Daoxing
2025, 47(5): 1356-1367. doi: 10.11999/JEIT241044
Abstract:
  Objective  Unmanned Aerial Vehicle (UAV) ad hoc networks have gained prominence in emergency and military operations due to their decentralized architecture and rapid deployment capabilities. However, the coexistence of saturated and unsaturated nodes in dynamic multi-hop topologies often results in inefficient time-slot utilization and network congestion. Existing Time Division Multiple Access (TDMA) protocols show limited adaptability to dynamic network conditions, while conventional Reinforcement Learning (RL)-based approaches primarily target single-hop or static scenarios, failing to address scalability challenges in multi-hop UAV networks. This study explores dynamic access control strategies that allow idle time slots of unsaturated nodes to be efficiently shared by saturated nodes, thereby improving overall network throughput.  Methods  A Deep Q-Learning-based Multi-Hop TDMA (DQL-MHTDMA) protocol is developed for UAV ad hoc networks. First, a backbone selection algorithm classifies nodes into saturated (high-traffic) and unsaturated (low-traffic) groups. The saturated nodes are then aggregated into a joint intelligent agent coordinated through UAV control links. Second, a distributed Deep Q-Learning (DQL) framework is implemented in each TDMA slot to dynamically select optimal transmission node sets from the saturated group. Two reward strategies are defined: (1) throughput maximization and (2) energy efficiency optimization. Third, the joint agent autonomously learns network topology and the traffic patterns of unsaturated nodes, adaptively adjusting transmission probabilities to meet the targeted objectives. Upon detecting topological changes, the agent initiates reconfiguration and retraining cycles to reconverge to optimal operational states.  Results and Discussions  Experiments conducted in static (16-node) and mobile (32-node) scenarios demonstrate the protocol’s effectiveness. 
As the number of iterations increases, the throughput gradually converges towards the theoretical optimum, reaching its maximum after approximately 2,000 iterations (Fig. 5). In Slot 4, the total throughput achieves the theoretical optimum of 1.8, while the throughput of Node 4 remains nearly zero. This occurs because the agent selects transmission sets {1, 8} or {2, 8} to share the channel, with transmissions from Node 1 preempting Node 4’s sending opportunities. Similarly, the total throughput of Slot 10 also attains the theoretical optimum of 1.8, resulting from the algorithm’s selection of conflict-free transmission sets {1} or {2} to share the channel simultaneously. The throughput of the DQL-MHTDMA algorithm is compared with that of other algorithms and the theoretical optimal value in odd-numbered time slots under Scenario 1. Across all time slots, the proposed algorithm achieves or closely approximates the theoretical optimum, significantly outperforming the traditional fixed-slot TDMA algorithm and the CF-MAC algorithm. Notably, the intelligent agent operates without prior knowledge of traffic patterns in each time slot or the topology of nodes beyond its own, demonstrating the algorithm’s ability to learn both slot occupancy patterns and network topology. This enables it to intelligently select the optimal transmission set to maximize throughput in each time slot. In the mobile (32-node) scenario, when the relay selection algorithm detects significant topological changes, the protocol is triggered to reselect actions. After each change, the algorithm rapidly converges to optimal action selection schemes and adaptively achieves near-theoretical-optimum maximum throughput across varying topologies (Fig. 9). Under the optimal energy efficiency objective policy, energy efficiency in time slot 11 converges after 2,000 iterations, reaching a value close to the theoretical optimum (Fig. 10). 
Compared to the throughput-oriented algorithm, energy efficiency improves from 0.35 to 1. This occurs because the throughput-optimized algorithm preferentially selects transmission sets {1, 8} or {2, 8} to maximize throughput. However, as Node 11 lies within the 2-hop neighborhood of both Nodes 1 and 8, concurrent channel occupancy induces collisions, significantly degrading energy efficiency. In contrast, the energy-efficiency-optimized algorithm preferentially selects an empty transmission set (i.e., no scheduled transmissions), thereby maximizing energy efficiency while maintaining moderate throughput levels. The paper presents statistical comparisons of energy efficiency against theoretical optima across eight distinct time slots in the static (16-node) scenario. As demonstrated in multi-hop network environments, the proposed algorithm achieves or closely approaches theoretical optimum energy efficiency values in all slots. Furthermore, while maintaining energy efficiency guarantees, the algorithm delivers significantly higher throughput compared to conventional TDMA protocols.  Conclusions  This paper addresses the access control problem in multi-hop UAV ad hoc networks, where saturated and non-saturated nodes coexist. A DQL-MHTDMA protocol is proposed. By consolidating saturated nodes into a single large agent, the protocol learns network topology and time-slot occupation patterns to select optimal access actions, thereby maximizing throughput or energy efficiency in each time slot. Simulation results demonstrate that the algorithm exhibits fast convergence, stable performance, and achieves the theoretically optimal values for both throughput and energy efficiency objectives.
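The slot-level decision the agent must learn can be illustrated with a tabular stand-in: for one TDMA slot, learn which transmission set of saturated nodes maximizes expected throughput when the slot's unsaturated owner transmits with some probability. Node names, the conflict structure, and the 30% occupancy figure are all hypothetical; the paper uses a deep Q-network over a joint agent rather than this bandit-style table.

```python
import numpy as np

def learn_slot_policy(actions, owner_busy_prob, episodes=4000,
                      eps=0.1, alpha=0.05, seed=0):
    """Epsilon-greedy Q-value learning for a single slot.  Each action
    is a candidate transmission set; the reward is the number of
    packets delivered, with a collision whenever a set member lies in
    the owner's 2-hop range while the owner is transmitting."""
    rng = np.random.default_rng(seed)
    Q = np.zeros(len(actions))
    for _ in range(episodes):
        a = rng.integers(len(actions)) if rng.random() < eps else int(Q.argmax())
        owner_busy = rng.random() < owner_busy_prob
        reward = sum(0.0 if (owner_busy and conflicts) else 1.0
                     for _, conflicts in actions[a])
        Q[a] += alpha * (reward - Q[a])      # running-average update
    return int(Q.argmax()), Q

# Hypothetical slot whose owner is busy 30% of the time.  Set A is
# conflict-free; in set B, node n4 collides whenever the owner is active.
actions = [
    [("n1", False), ("n8", False)],   # set A: expected throughput 2.0
    [("n2", False), ("n4", True)],    # set B: expected throughput 1.7
]
best, Q = learn_slot_policy(actions, owner_busy_prob=0.3)
```

The agent discovers the conflict-free set without being told the owner's traffic pattern or the topology, mirroring at miniature scale how the DQL-MHTDMA agent learns slot occupancy and conflict relations from rewards alone.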
A Model-Assisted Federated Reinforcement Learning Method for Multi-UAV Path Planning
LU Yin, LIU Jinzhi, ZHANG Min
2025, 47(5): 1368-1380. doi: 10.11999/JEIT241055
Abstract:
  Objective  The rapid advancement of low-altitude Internet of Things (IoT) applications has increased the demand for efficient sensor data acquisition. Unmanned Aerial Vehicles (UAVs) have emerged as a viable solution due to their high mobility and deployment flexibility. However, existing multi-UAV path planning algorithms show limited adaptability and coordination efficiency in dynamic and complex environments. To overcome these limitations, this study develops a model-assisted approach that constructs a hybrid simulated environment by integrating channel modeling with position estimation. This strategy reduces the interaction cost between UAVs and the real world. Building on this, a federated reinforcement learning-based algorithm is proposed, which incorporates a maximum entropy strategy, monotonic value function decomposition, and a federated learning framework. The method is designed to optimize two objectives: maximizing the data collection rate and minimizing the flight path length. The proposed algorithm provides a scalable and efficient solution for cooperative multi-UAV path planning under dynamic and uncertain conditions.  Methods  This study formulates the multi-UAV path planning problem as a multi-objective optimization task and models it using a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) to address dynamic environments with partially unknown device positions. To improve credit assignment and exploration efficiency, enhanced reinforcement learning algorithms are developed. The exploration capacity of individual agents is increased using a maximum entropy strategy, and a dynamic entropy regularization mechanism is incorporated to avoid premature convergence. To ensure global optimality of the cooperative strategy, the method integrates monotonic value function decomposition based on the QMIX algorithm. 
A multi-dimensional reward function is designed to guide UAVs in balancing competing objectives, including data collection, path length, and device exploration. To reduce interaction costs in real environments, a model-assisted training framework is established. This framework combines known information with neural networks to learn channel characteristics and applies an improved particle swarm algorithm to estimate unknown device locations. To enhance generalization, federated learning is employed to aggregate local experiences from multiple UAVs into a global model through periodic updates. In addition, an attention mechanism is introduced to optimize inter-agent information aggregation, improving the accuracy of collaborative decision-making.  Results and Discussions  Simulation results demonstrate that the proposed algorithm converges more rapidly and with reduced volatility (red curves in Fig. 3 and Fig. 4), due to a 70% reduction in interactions with the real environment achieved by the model-assisted framework. The federated learning mechanism further enhances policy generalization through global model aggregation. Under test conditions with an initial energy of 50–80 J, the data collection rate increases by 2.1–7.4%, and the flight path length decreases by 6.9–14.4% relative to the baseline model (Fig. 6 and Fig. 7), confirming the effectiveness of the reward function and exploration strategy (Fig. 5). The attention mechanism allows UAVs to identify dependencies among sensing targets and cooperative agents, improving coordination. As shown in Fig. 2, the UAVs dynamically partition the environment to cover undiscovered devices, reducing path overlap and significantly improving collaborative efficiency.  Conclusions  This study proposes a model-assisted multi-UAV path planning method that integrates maximum entropy reinforcement learning, the QMIX algorithm, and federated learning to address the multi-objective data collection problem in complex environments. 
By incorporating modeling, dynamic entropy adjustment, and an attention mechanism within the Dec-POMDP framework, the approach effectively balances exploration and exploitation while resolving collaborative credit assignment in partially observable settings. The use of federated learning for distributed training and model sharing reduces communication overhead and enhances system scalability. Simulation results demonstrate that the proposed algorithm achieves superior performance in data collection efficiency, path optimization, and training stability compared with conventional methods. Future work will focus on coordination of heterogeneous UAV clusters and robustness under uncertain communication conditions to further support efficient data collection for low-altitude IoT applications.
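The periodic global aggregation step can be sketched with a FedAvg-style parameter average over the UAVs' local models. This is a generic construction with uniform weights; the paper may weight agents by their local experience volume, and the single-layer models below are purely illustrative.

```python
import numpy as np

def federated_average(local_models, weights=None):
    """Aggregate per-layer parameters from several agents into a
    global model by weighted averaging.  Each local model is a list of
    parameter arrays with identical shapes across agents."""
    n = len(local_models)
    if weights is None:
        weights = [1.0 / n] * n          # uniform weighting by default
    return [sum(w * m[layer] for w, m in zip(weights, local_models))
            for layer in range(len(local_models[0]))]

# Three hypothetical UAVs, each holding a one-layer model.
locals_ = [[np.full((2, 2), v)] for v in (1.0, 2.0, 6.0)]
global_model = federated_average(locals_)
```

After each aggregation round, the averaged parameters are broadcast back to the agents, which is how local experience propagates across the swarm without sharing raw trajectories.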
Networking and Resource Allocation Methods for Opportunistic UAV-assisted Data Collection
SUN Weihao, WANG Hai, QIN Zhen, QU Yuben
2025, 47(5): 1381-1391. doi: 10.11999/JEIT241053
Abstract:
  Objective  Unmanned Aerial Vehicles (UAVs) tasked with customized operations, such as environmental monitoring and intelligent logistics, are referred to as opportunistic UAVs. While traversing the task area, these UAVs can be leveraged by ground nodes in regions that are either uncovered or heavily loaded, functioning as temporary data storage. This reduces the operational costs associated with deploying dedicated UAVs for data collection. In practice, however, the flight paths of opportunistic UAVs are uncontrolled, and the data-uploading capabilities of ground nodes in various regions vary. To enhance efficiency, ground nodes can actively form a network, pre-aggregate data, and allocate resources to cluster head nodes located advantageously for data transmission. Despite extensive research into networking technologies, two key challenges remain. First, existing studies predominantly focus on static networking strategies, overlooking the reliability of data aggregation in mobile scenarios. Ground nodes involved in tasks such as emergency response, disaster relief, or military reconnaissance may exhibit mobility. The dynamic topology of these mobile nodes, coupled with non-line-of-sight transmission path loss and severe signal fading, creates substantial challenges for reliable transmission, leading to bit errors, packet losses, and retransmissions. Therefore, mobile ground nodes must dynamically adjust their subnet data transmission strategies based on the time-varying relative distances between cluster members and heads. Second, most studies focus on data aggregation capacity within subnets but fail to consider the uploading capabilities of cluster heads. In opportunistic communication scenarios, where UAV flight paths are uncontrolled, the data-uploading capacity of each subnet is constrained by the minimum of the data collected, aggregation capacity, and uploading capability. 
Therefore, effective networking strategies for opportunistic UAV-assisted data collection must account for the relationships between cluster members, cluster heads, and UAVs. Coordinated resource allocation and subnet formation strategies are essential to improving system performance. In summary, exploring networking and resource allocation methods for opportunistic UAV-assisted data collection is of significant practical importance.  Methods  Due to the interdependent nature of the subnet data transmission, resource allocation, and formation strategies, the problem presents a large state space that is difficult to solve directly. To address this, a decomposition approach is applied. First, given the subnet formation strategy, the paper sequentially derives the closed-form solutions for the subnet data transmission and resource allocation strategies, significantly simplifying the original problem. Next, the subnet formation subproblem is modeled as a formation game. An altruistic networking criterion is proposed, and using potential game theory, it is proven that the formulated game has at least one pure strategy Nash equilibrium. A subnet formation strategy based on the best response method is proposed. Finally, the convergence and complexity of the proposed algorithm are analyzed.  Results and Discussions  Simulation results confirm the effectiveness of the proposed algorithm. As shown in the networking diagram, the algorithm predominantly selects nodes near the flight path as cluster heads due to their superior data uploading capabilities (Fig. 2, Fig. 3(a)). The data uploaded is constrained by the minimum values of the data collected, data aggregation capacity, and data uploading capacity, creating a bottleneck. In this context, the algorithm balances subnet data aggregation and uploading capacities, ultimately improving transmission efficiency (Fig. 3(b)). Additionally, the relationship between distance and subnet data transmission strategy is evaluated. 
Specifically, the proposed transmission strategy reduces the amount of data aggregated for reliability as the distance increases, while increasing data aggregation for efficiency when the distance decreases (Fig. 4). This dynamic transmission approach enhances reliability as the amount of aggregated data fluctuates (Fig. 5(a)). Furthermore, the proposed algorithm outperforms benchmark networking schemes with increasing iteration numbers, demonstrating up to a 56.3% improvement (Fig. 5(b)). Finally, regardless of variations in flight speed, the proposed algorithm consistently shows superior transmission efficiency (Fig. 5(c)).  Conclusions  This paper explores terrestrial networking and resource allocation methods to enhance the transmission efficiency of opportunistic UAV-assisted data collection. The strategies for subnet data transmission, resource allocation, and formation are jointly addressed. The paper derives closed-form solutions for the subnet data transmission and resource allocation strategies sequentially, followed by the formulation of the subnet formation strategy as a formation game, which is solved using the best response method. Extensive simulation results validate the performance improvements. However, this study considers only scenarios with a single opportunistic UAV. In practical applications, multiple UAVs may coexist, requiring further analysis of the time-varying relationships between cluster heads and UAVs in future work.
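The best-response solution of the formation game described above can be sketched as follows. This is a deliberately simplified toy model (the `utility` function, node positions, and load penalty are illustrative assumptions, not the paper's channel or capacity formulation); it shows only the pattern of iterating best responses until no node can profitably deviate, which in an exact potential game yields a pure-strategy Nash equilibrium.

```python
import random

def best_response_clustering(nodes, heads, utility, max_rounds=100):
    """Best-response dynamics for a subnet formation game: each node
    repeatedly switches to the cluster head that maximizes its utility
    given everyone else's current choice. In an exact potential game,
    sequential best responses converge to a pure-strategy Nash equilibrium."""
    choice = {n: random.choice(heads) for n in nodes}
    for _ in range(max_rounds):
        changed = False
        for n in nodes:
            best = max(heads, key=lambda h: utility(n, h, choice))
            if utility(n, best, choice) > utility(n, choice[n], choice):
                choice[n] = best
                changed = True
        if not changed:  # no profitable deviation remains: equilibrium reached
            return choice
    return choice

# Toy congestion-game utility: prefer nearby heads, penalize crowded subnets.
def make_utility(pos, head_pos):
    def u(n, h, choice):
        load = sum(1 for m, hm in choice.items() if hm == h and m != n)
        return -abs(pos[n] - head_pos[h]) - 0.5 * load
    return u
```

Because the load term makes this a congestion-type (hence potential) game, termination of the loop certifies that the returned assignment is a Nash equilibrium.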
Low-complexity MRC Receiver Algorithm Based on OTFS System
WANG Zhenduo, JI Tianzhi, SUN Rongchen
2025, 47(5): 1392-1401. doi: 10.11999/JEIT241056
Abstract:
  Objective  Ultra-high-speed mobile applications—such as Unmanned Aerial Vehicles (UAVs), high-speed railways, satellite communications, and vehicular networks—place increasing demands on communication systems, particularly under high-Doppler conditions. Orthogonal Time Frequency Space (OTFS) modulation offers advantages in such environments due to its robustness against Doppler effects. However, conventional receiver algorithms rely on computationally intensive matrix operations, which limit their efficiency and degrade real-time performance in high-mobility scenarios. This paper proposes a low-complexity Maximum Ratio Combining (MRC) receiver for OTFS systems that avoids matrix inversion by exploiting the structural characteristics of OTFS channel matrices in the Delay-Doppler (DD) domain. The proposed receiver achieves high detection performance while substantially reducing computational complexity, supporting practical deployment in ultra-high-speed mobile communication systems.  Methods  The proposed low-complexity receiver algorithm applies MRC in the DD domain to iteratively extract and coherently combine multipath components. This approach enhances Bit Error Rate (BER) performance by optimizing signal aggregation while avoiding computationally intensive operations. To further reduce complexity, the algorithm incorporates interleaving and deinterleaving operations that restructure the channel matrix into a sparse upper triangular form. This transformation enables efficient matrix decomposition and facilitates simplified processing. To address the computational burden associated with matrix inversion during symbol detection, a low-complexity LDL decomposition algorithm is introduced. Compared with conventional matrix inversion techniques, this method substantially reduces computational overhead. Furthermore, a low-complexity inversion method for lower triangular matrices is implemented to further improve efficiency during the decision process. 
Simulation results confirm that the proposed receiver achieves BER performance comparable to that of traditional MRC algorithms while significantly lowering computational complexity.  Results and Discussions  Simulation results confirm that the proposed low-complexity MRC receiver achieves BER performance comparable to that of conventional MRC receivers while substantially improving computational efficiency under high-mobility conditions (Fig. 3). The algorithm is evaluated across a range of environments, including scenarios characterized by high-speed motion and complex multipath interference. It outperforms Linear Minimum Mean Square Error (LMMSE) equalizers and Gauss–Seidel iterative equalization algorithms. Despite its reduced complexity, the proposed receiver maintains the same BER performance as traditional MRC methods. The algorithm demonstrates effective scalability as the number of symbols and subcarriers increases. Under conditions of increased system complexity, the receiver sustains computational efficiency without performance degradation (Fig. 4, Fig. 5). These results support its suitability for practical deployment in high-speed mobile communication systems employing OTFS modulation. The receiver also exhibits strong resilience to variations in wireless channel models. Across both typical urban multipath scenarios and high-velocity vehicular conditions, it maintains stable BER performance (Fig. 8). In addition, the receiver demonstrates robust tolerance to Doppler shift fluctuations and variable noise levels. These characteristics enable its application in dynamic environments with rapidly changing channel conditions. The algorithm’s efficiency and performance stability make it particularly well suited for real-time implementation in ultra-high-mobility networks, including UAV systems, high-speed rail communications, and other next-generation wireless platforms. 
By reducing computational complexity without compromising detection accuracy, the proposed receiver supports large-scale deployment of OTFS-based systems, addressing key performance and scalability challenges in emerging communication infrastructures.  Conclusions  This study proposes a low-complexity MRC receiver algorithm for OTFS systems. By introducing an interleaver and deinterleaver, the channel matrix is transformed into a sparse upper triangular form, enabling efficient inversion with reduced computational cost. In addition, the receiver integrates a low-complexity LDL decomposition algorithm and an upper triangular matrix inversion method to further minimize the computational burden associated with matrix operations. Simulation results confirm that the proposed receiver achieves equivalent BER performance to conventional MRC receivers. Moreover, under identical channel conditions, it demonstrates superior BER performance relative to linear receivers.
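The core idea of replacing explicit matrix inversion with an LDL decomposition can be illustrated on a small real symmetric positive-definite system. This is a minimal sketch of the general technique: the paper works on the Hermitian DD-domain channel matrices of OTFS, not this toy example, and its triangular-inversion refinements are omitted here.

```python
def ldl_decompose(A):
    """LDL^T factorization of a symmetric positive-definite matrix:
    A = L @ diag(D) @ L^T with L unit lower triangular. Avoids both the
    square roots of Cholesky and the cost of forming an explicit inverse."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        L[j][j] = 1.0
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] * D[k] for k in range(j))) / D[j]
    return L, D

def ldl_solve(L, D, b):
    """Solve (L diag(D) L^T) x = b with two triangular sweeps; no inversion."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                       # forward substitution: L y = b
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    z = [y[i] / D[i] for i in range(n)]      # diagonal scaling
    x = [0.0] * n
    for i in reversed(range(n)):             # backward substitution: L^T x = z
        x[i] = z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x
```

The two substitution sweeps cost O(n²) per solve once the O(n³) factorization is done, which is the source of the complexity savings when many detection solves share one channel matrix.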
Joint Task Allocation, Communication Base Station Association and Flight Strategy Optimization Design for Distributed Sensing Unmanned Aerial Vehicles
HE Jiang, YU Wanxin, HUANG Hao, JIANG Weiheng
2025, 47(5): 1402-1417. doi: 10.11999/JEIT240738
Abstract:
  Objective  The demand for Unmanned Aerial Vehicles (UAVs) in distributed sensing applications has increased significantly due to their low cost, flexibility, mobility, and ease of deployment. In these applications, the coordination of multi-UAV sensing tasks, communication strategies, and flight trajectory optimization presents a significant challenge. Although there have been preliminary studies on the joint optimization of UAV communication strategies and flight trajectories, most existing work overlooks the impact of the randomly distributed and dynamically updated task airspace model on the optimal design of UAV communication and flight strategies. Furthermore, accurate UAV energy consumption modeling is often lacking when establishing system design goals. Energy consumption during flight, sensing, and data transmission is a critical issue, especially given the UAV’s limited payload capacity and energy supply. Achieving an accurate energy consumption model is essential for extending UAV operational time. To address the requirements of multiple UAVs performing distributed sensing, particularly when tasks are dynamically updated and data must be transmitted to ground base stations, this paper explores the optimal design of joint UAV sensing task allocation, base station association for data backhaul, flight strategy planning, and transmit power control.  Methods  To coordinate the relationships among UAVs, base stations, and sensing tasks, a protocol framework for multi-UAV distributed task sensing applications is first proposed. This framework divides the UAVs’ behavior during distributed sensing into four stages: cooperation, movement, sensing, and transmission. The framework ensures coordination in the UAVs’ movement to the task area, task sensing, and the backhaul transmission of sensed data. A sensing task model based on dynamic updates, a UAV movement model, a UAV sensing behavior model, and a data backhaul transmission model are then established. 
A revenue function, combining task sensing utility and task execution costs, is designed, leading to a joint optimization problem of UAV task allocation, communication base station association, and flight strategy. The objective is to maximize the long-term weighted difference between sensing utility and execution cost. Given that the optimization problem involves high-dimensional decision variables in both discrete and continuous forms, and the objective function is non-convex with respect to these variables, the problem is a typical non-convex Mixed-Integer Non-Linear Programming (MINLP) problem. It falls within the NP-Hard complexity class. Centralized optimization algorithms for this formulation require a central node with high computational capacity and the collection of substantial additional information, such as channel state and UAV location data. This results in high information-interaction overhead and poor scalability. To overcome these challenges, the problem is reformulated as a Markov Game (MG). An effective algorithm is designed by leveraging the distributed coordination concept of Multi-Agent (MA) systems and the exploration capability of deep Reinforcement Learning (RL) within the optimization solution space. Specifically, due to the complex coupling between the continuous and discrete action spaces in the MG problem, a novel solution algorithm called Multi-Agent Independent-Learning Compound-Action Actor-Critic (MA-IL-CA2C) based on Independent Learning (IL) is proposed. The core idea is as follows: first, the independent-learning algorithm is applied to extend single-agent RL to a MA environment. Then, deep learning is used to represent the high-dimensional action and state spaces. To handle the combined discrete and continuous action spaces, the UAV action space is decomposed into discrete and continuous components, with the Deep Q-Network (DQN) algorithm applied to the discrete space and the Deep Deterministic Policy Gradient (DDPG) algorithm to the continuous space.  
Results and Discussions  The computational complexity of action selection and training for the proposed MA-IL-CA2C algorithm is theoretically analyzed. The results show that its complexity is almost equivalent to that of the two benchmark algorithms, DQN and DDPG. Additionally, the performance of the proposed algorithm is simulated and analyzed. When compared with the DQN, DDPG, and Greedy algorithms, the MA-IL-CA2C algorithm demonstrates lower network energy consumption throughout the network operation (Fig. 6), improved system revenue (Fig. 5, Fig. 8, and Fig. 9), and optimized UAV flight strategies (Fig. 7).  Conclusions  This paper addresses and solves the optimal design problems of joint UAV sensing task allocation, data backhaul base station association, flight strategy planning, and transmit power control for multi-UAV distributed task sensing. A new MA-IL-CA2C algorithm based on IL is proposed. The simulation results show that the proposed algorithm achieves better system revenue while minimizing UAV energy consumption.
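The compound-action decomposition described in the methods, where a DQN-style component picks the discrete action and a DDPG-style component outputs the continuous one, can be sketched with a deliberately simplified agent. A tabular value store and a one-parameter "actor" stand in for the deep networks; the class, its parameters, and the toy update rules are all illustrative assumptions, not the paper's MA-IL-CA2C architecture.

```python
import random

class CompoundActionAgent:
    """Toy compound-action agent: a Q-value table selects the discrete action
    (e.g., base-station association) and a deterministic one-parameter policy
    outputs the continuous action (e.g., transmit power)."""

    def __init__(self, n_discrete, power_range, alpha=0.1, epsilon=0.1):
        self.q = {}                 # (state, discrete_action) -> value estimate
        self.theta = 0.5            # single "actor" parameter for power
        self.n, self.alpha, self.eps = n_discrete, alpha, epsilon
        self.p_min, self.p_max = power_range

    def act(self, state):
        # Discrete part: epsilon-greedy over Q-values (DQN-style selection).
        if random.random() < self.eps:
            a = random.randrange(self.n)
        else:
            a = max(range(self.n), key=lambda k: self.q.get((state, k), 0.0))
        # Continuous part: deterministic policy maps theta to a power level.
        power = self.p_min + self.theta * (self.p_max - self.p_min)
        return a, power

    def update(self, state, a, reward):
        # Running-average value update for the chosen discrete action.
        key = (state, a)
        self.q[key] = self.q.get(key, 0.0) + self.alpha * (reward - self.q.get(key, 0.0))
        # Crude reward-following nudge of the actor parameter, clipped to [0, 1].
        self.theta = min(1.0, max(0.0, self.theta + self.alpha * 0.01 * reward))
```

In the full algorithm each UAV runs such an agent independently (the IL part), with the table and the scalar actor replaced by neural networks trained from replayed experience.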
Aerial Target Intention Recognition Method Integrating Information Classification Processing and Multi-scale Embedding Graph Robust Learning with Noisy Labels
SONG Zihao, ZHOU Yan, CAI Yichao, CHENG Wei, YUAN Kai, LI Hui
2025, 47(5): 1418-1433. doi: 10.11999/JEIT241074
Abstract:
  Objective  Aerial Target Intention Recognition (ATIR) predicts and assesses the intentions of non-cooperative targets by integrating information acquired and processed by various sensors. Accurate recognition enhances decision-making, aiding commanders and combatants in steering engagements favorably. Therefore, robust and precise recognition methods are essential. Advances in big data and detection technologies have driven research into deep-learning-based intention recognition. However, noisy labels in target intention recognition datasets hinder the reliability of traditional deep-learning models. To address this issue, this study proposes an intention recognition method incorporating Information Classification Processing (ICP) and multi-scale robust learning. The trained model demonstrates high accuracy even in the presence of noisy labels.  Methods  This method integrates an ICP network, a cross-scale embedding fusion mechanism, and multi-scale embedding graph learning. The ICP network performs cross-classification processing by analyzing attribute correlations and differences, facilitating the extraction of embeddings conducive to intention recognition. The cross-scale embedding fusion mechanism employs target sequences at different scales to train multiple Deep Neural Networks (DNNs) simultaneously. It sequentially integrates robust embeddings from fine to coarse scales. During training, complementary information across scales enables a cross-teaching strategy, where each encoder selects clean-label samples based on a small-loss criterion. Additionally, multi-scale embedding graph learning establishes relationships between labeled and unlabeled samples to correct noisy labels. Specifically, for high-loss unselected samples, the Speaker-listener Label PropagAtion (SLPA) algorithm refines their labels using the multi-scale embedding graph, improving model adaptation to the class distribution of target attribute sequences.  
  Results and Discussions  When the proportion of symmetric noise is 20% (Table 1), the test accuracy of the Cross-Entropy (CE) method exceeds 80%, demonstrating the effectiveness of the ICP network. The proposed method achieves both test accuracy and a Macro F1 score (MF1) above 92%. At higher noise levels—50% symmetric noise and 40% asymmetric noise (Table 1)—the performance of other methods declines significantly. In contrast, the proposed method maintains accuracy and MF1 above 80%, indicating greater stability and robustness. This strong performance can be attributed to: (1) Cross-scale fusion, which integrates complementary information from different scales, enhancing the separability and robustness of fused embeddings. This ensures the selection of high-quality samples and prevents performance degradation caused by noisy labels in label propagation. (2) SLPA in multi-scale embedding graph learning, which stabilizes label propagation even when the dataset contains a high proportion of noisy labels.  Conclusions  This study proposes an intelligent method for recognizing aerial target intentions in the presence of noisy labels. The method effectively addresses noisy labels by integrating an ICP network, a cross-scale embedding fusion mechanism, and multi-scale embedding graph learning. First, an embedding extraction encoder based on the ICP network is constructed using acquired target attributes. The cross-scale embedding fusion mechanism then integrates encoder outputs from sequences at different scales, facilitating the extraction of multi-scale features and enhancing the reliability of clean samples identified by the small-loss criterion. Finally, multi-scale embedding graph learning, incorporating SLPA, refines noisy labels by leveraging selected clean labels. Experiments on the ATIR dataset across various noise types and levels demonstrate that the proposed method achieves significantly higher test accuracy and MF1 than other baseline approaches. 
Ablation studies further validate the effectiveness and robustness of the network architecture and mechanisms.
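The small-loss selection and cross-teaching steps used above can be sketched generically as follows. This is an illustration of the standard criterion for learning with noisy labels, assuming per-sample losses have already been computed; the paper's selection schedule, encoders, and multi-scale fusion are not reproduced here.

```python
def small_loss_selection(losses, noise_rate):
    """Small-loss criterion: samples with the smallest loss are treated as
    clean-label candidates. Keeps roughly a (1 - noise_rate) fraction and
    returns their indices in ascending order."""
    keep = max(1, int(round(len(losses) * (1.0 - noise_rate))))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:keep])

def cross_teaching_batches(losses_a, losses_b, noise_rate):
    """Cross-teaching: each network selects its small-loss samples and hands
    them to its peer for the next update, limiting confirmation bias."""
    batch_for_a = small_loss_selection(losses_b, noise_rate)  # B teaches A
    batch_for_b = small_loss_selection(losses_a, noise_rate)  # A teaches B
    return batch_for_a, batch_for_b
```

The high-loss samples excluded here are exactly the ones whose labels the multi-scale embedding graph learning stage would attempt to refine rather than discard.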
Wide-Area Multilateration Time Synchronization Method Based on Signal Arrival Time Modeling
TANG Xinmin, ZHOU Yang, LU Qixing, GUAN Xiangmin
2025, 47(5): 1434-1449. doi: 10.11999/JEIT240670
Abstract:
  Objective  Wide-Area Multilateration (WAM), a high-precision positioning technology currently under nationwide deployment, is widely applied in aircraft positioning on airport surfaces and in terminal areas. However, as WAM depends on collaborative signal processing across multiple stations, challenges such as time synchronization and computational complexity continue to constrain positioning accuracy. This study develops a mathematical model for time synchronization and “same-message” extraction based on Time Of Arrival (TOA), achieving synchronization by calculating the “synchronized start time” of ground sensors. The proposed method offers low computational complexity and is straightforward to implement. To enhance TOA estimation accuracy and reduce synchronization error, a joint filtering strategy—Variable Moving Average Filtering and Kalman (VMAF-Kalman)—is proposed to minimize TOA counting deviations introduced by clock drift. The model addresses synchronization challenges in distributed station deployments and employs joint filtering to correct initial clock source deviations.  Methods  This study addresses the challenge of high-precision TOA acquisition by proposing the VMAF-Kalman joint filtering method, which combines variable moving average filtering with Kalman filtering. This approach filters the phase difference count between the GPS 1 Pulse Per Second (1PPS) signal and the local crystal oscillator 1PPS signal, producing a stable reference clock to mitigate the effects of noise and oscillator aging that induce clock drift. As a result, stable TOA counting with a precision of 2.5 ns is achieved on Field-Programmable Gate Arrays (FPGAs). To resolve synchronization issues in distributed WAM systems, a time synchronization model based on TOA is proposed, which determines the synchronized start time of remote stations. Additionally, a same-message extraction model is developed to identify the TOA of identical messages, enabling accurate multilateration positioning.  
Results and Discussions  Two experiments evaluate the proposed method and model: a filtering performance comparison and an actual flight trajectory positioning experiment for time synchronization validation. The latter includes two simulation scenarios: Scenario 1 consists of drone positioning tests, and Scenario 2 consists of civil aviation aircraft positioning tests. The simulation results indicate that the joint filtering method outperforms single filtering approaches, reducing TOA counting errors by 36.84% and 25.36% in the respective scenarios. Both the drone and civil aviation tests demonstrate high positioning accuracy, with errors and update rates meeting standard requirements. These findings confirm the practicality of the proposed method and the improved synchronization accuracy of the model.  Conclusions  Firstly, the proposed VMAF-Kalman joint filtering method demonstrates clear advantages over single filtering algorithms in both performance and hardware efficiency. Simulation results show that the output of the PID controller remains within a narrower fluctuation range, while TOA counting errors are reduced by 36.84% and 25.36%, respectively. These findings confirm that joint filtering stabilizes clock signals, improves TOA counting accuracy in FPGAs, and reduces synchronization errors. Secondly, the time synchronization and same-message extraction models developed in this study simplify existing synchronization methods by enabling WAM synchronization and TOA extraction through algorithmic computation alone. Simulations incorporating actual flight data in low-altitude airspace, verified across multiple positioning algorithms, further validate the model. Drone test results show that vertical Root Mean Square Error (RMSE) and deviation remain within 20 m, with horizontal RMSE below 10 m. 
For civil aviation aircraft, all algorithms achieved accuracy rates above 80%, with average errors under 300 m and position update intervals within 5 s, meeting established standards. The experimental outcomes confirm the feasibility and applicability of the proposed model for high-precision WAM time synchronization.
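The cascade of moving-average smoothing followed by Kalman tracking applied to the 1PPS phase-difference counts can be sketched in scalar form. This is a toy version in the spirit of the VMAF-Kalman idea: the window length and the process/measurement noise parameters `q` and `r` are illustrative assumptions, not the paper's tuned values, and the variable-window logic is reduced to a fixed sliding window.

```python
def joint_filter(counts, window=5, q=1e-4, r=1.0):
    """Two-stage smoothing of a phase-difference count sequence:
    stage 1 averages out short-term jitter over a sliding window,
    stage 2 tracks the slowly drifting mean with a scalar Kalman filter
    using a random-walk state model."""
    # Stage 1: sliding-window moving average.
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - window + 1)
        seg = counts[lo:i + 1]
        smoothed.append(sum(seg) / len(seg))
    # Stage 2: scalar Kalman filter.
    x, p = smoothed[0], 1.0          # initial state estimate and variance
    out = []
    for z in smoothed:
        p += q                        # predict: state variance grows by q
        k = p / (p + r)               # Kalman gain
        x += k * (z - x)              # update with measurement z
        p *= (1.0 - k)                # posterior variance
        out.append(x)
    return out
```

In the actual system the filtered count steers the local oscillator (via a PID controller), which is what stabilizes the TOA counter on the FPGA.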
Distributionally Robust Task Offloading for Low-Altitude Intelligent Networks
JIA Ziye, JIANG Guanwang, CUI Can, ZHANG Lei, WU Qihui
2025, 47(5): 1450-1460. doi: 10.11999/JEIT240799
Abstract:
  Objective   The rapid development of Low-Altitude Intelligent Networks (LAINs) and the widespread adoption of Multi-access Edge Computing (MEC) have introduced challenges related to the random variability in task data sizes, which constrains the efficiency of LAIN-assisted MEC networks. Although task offloading has been extensively studied, most existing research overlooks the uncertainty in task sizes. This randomness can lead to unexpected outages and inefficient resource utilization, making it difficult to meet quality-of-service requirements. Distributionally Robust Optimization (DRO) based on uncertainty sets is a promising approach to addressing these challenges. By formulating and solving a DRO problem that accounts for task uncertainties, this study provides a robust and conservative solution applicable to various LAIN-related scenarios.  Methods   This study proposes an LAIN-assisted MEC network comprising multiple hovering Unmanned Aerial Vehicles (UAVs), a High-Altitude Platform (HAP), and Ground Users (GUs). To accurately model task size randomness, three probabilistic distance metrics—the L1 norm, the L∞ norm, and the Fortet-Mourier (FM) distance—are introduced to construct uncertainty sets based on historical data. A DRO problem is then formulated using these uncertainty sets to optimize task offloading decisions within the proposed network. The objective is to minimize system latency under the worst-case probability distribution of task sizes, thereby enhancing system robustness. The proposed DRO problem, structured as a minimization-maximization mixed-integer programming model, is solved iteratively through decomposition into an inner and an outer problem. The inner problem, a linear programming problem, is addressed using standard solvers such as GUROBI. 
For the outer problem, the low-complexity Branch and Bound (BB) method is employed to solve the integer programming component efficiently by systematically exploring subsets of the solution space and pruning infeasible regions using upper and lower bounds. To handle large-scale and multi-constraint scenarios, a heuristic Binary Whale Optimization Algorithm (BWOA) is further integrated to accelerate convergence. Therefore, the Distributionally Robust Task Offloading Optimization Algorithm (DRTOOA) is developed by combining BB and BWOA. Initially, BB determines a subset of binary variables, followed by BWOA optimization for the remaining variables. This process is repeated iteratively until convergence is achieved.  Results and Discussions   The performance of the proposed DRTOOA is evaluated through numerical simulations. System latency is analyzed under three probabilistic distance metrics used for constructing uncertainty sets (Fig. 3). As the tolerance of the uncertainty sets increases, system latency rises across all metrics. Notably, the latency achieved via DRTOOA is lower than that obtained using the Exhaustive Method (EM) but higher than that using the BB method, demonstrating its robustness against uncertainties. In terms of computational efficiency, DRTOOA outperforms other benchmark algorithms by achieving the shortest latency, highlighting its effectiveness in solving large-scale problems (Fig. 4). Among the three probabilistic distance metrics, the FM metric yields the lowest system latency with relatively stable performance as the tolerance changes (Fig. 5). Additionally, the impact of uncertainty set tolerance on the probability distribution of task sizes is examined (Fig. 6). As the tolerance decreases, the task size distribution aligns more closely with the reference distribution. Conversely, increasing the tolerance results in a higher probability of larger task sizes. 
Notably, optimization based on the FM probabilistic distance metric exhibits greater stability under varying tolerances. Furthermore, the impact of HAP quota limitations and the number of GUs on system latency is analyzed (Figs. 7 and 8). System latency decreases as HAP quotas increase, indicating that additional HAP resources alleviate task processing pressure. Conversely, an increase in the number of GUs leads to higher system latency due to the greater computational demand. Overall, DRTOOA effectively optimizes system latency and demonstrates superior performance compared with other baseline algorithms in terms of robustness and computational efficiency.  Conclusions   This study addresses the task offloading problem in LAIN-assisted MEC networks, considering the uncertainty in task sizes. By constructing uncertainty sets based on different probabilistic distance metrics and formulating a DRO problem, the DRTOOA is proposed, effectively integrating the BB method with the BWOA. Simulation results demonstrate that: (1) Compared with the BB method and the EM, DRTOOA effectively reduces system latency, demonstrating higher efficiency in problem-solving. (2) Among the three probabilistic distance metrics—FM distance, L1 norm distance, and L∞ norm distance—the FM metric exhibits the greatest stability, yielding the lowest system latency under the same conditions. (3) System latency is influenced by factors such as the tolerance of uncertainty sets, HAP quota limitations, and the number of GUs. However, this study assumes static or quasi-static network nodes for simplification, limiting the consideration of UAV flexibility and dynamicity. Future research should explore the impact of UAV and HAP mobility, as well as real-world factors such as communication interference and equipment failures, on task offloading decisions and overall system performance.
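The branch-and-bound component of the outer integer problem can be illustrated on a tiny 0/1 selection problem. This is a generic knapsack sketch showing only the bounding-and-pruning pattern; it is not the paper's DRTOOA, its latency objective, or its BWOA hybridization.

```python
def branch_and_bound_knapsack(values, weights, capacity):
    """Branch and bound over binary variables: maximize total value under a
    capacity constraint, pruning any branch whose optimistic (fractional
    relaxation) bound cannot beat the incumbent solution."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    best = {'value': 0, 'choice': [0] * n}

    def bound(idx, cap, val):
        # Optimistic bound: fill remaining capacity fractionally, best ratio first.
        for i in order[idx:]:
            if weights[i] <= cap:
                cap -= weights[i]
                val += values[i]
            else:
                return val + values[i] * cap / weights[i]
        return val

    def dfs(idx, cap, val, choice):
        if val > best['value']:                    # update incumbent
            best['value'], best['choice'] = val, choice[:]
        if idx == len(order) or bound(idx, cap, val) <= best['value']:
            return                                 # prune: bound cannot beat incumbent
        i = order[idx]
        if weights[i] <= cap:                      # branch 1: set x_i = 1
            choice[i] = 1
            dfs(idx + 1, cap - weights[i], val + values[i], choice)
            choice[i] = 0
        dfs(idx + 1, cap, val, choice)             # branch 2: set x_i = 0

    dfs(0, capacity, 0, [0] * n)
    return best['value'], best['choice']
```

In the hybrid scheme, such a BB pass would fix a subset of the binary offloading variables, with the remaining variables handed to the heuristic BWOA stage.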
Wireless Communication and Internet of Things
Research on Signal Detection of Adaptive O-OFDM Symbol Decomposition in Rough Set Information System
JIA Kejun, CHE Jiaqi, LIU Jiaxin, XIAN Yuqin, QIN Cuicui, YANG Boran
2025, 47(5): 1461-1473. doi: 10.11999/JEIT240864
Abstract:
  Objective  Adaptive Optical Orthogonal Frequency Division Multiplexing Symbol Decomposition with Serial Transmission (O-OFDM-ASDST) effectively suppresses the nonlinear clipping distortion of Light-Emitting Diodes (LEDs) in Visible Light Communication (VLC). However, the decomposition symbols introduced in the O-OFDM-ASDST system bring in Additive White Gaussian Noise (AWGN), which degrades Bit Error Rate (BER) performance. To address this issue, this study proposes an O-OFDM-ASDST signal detection algorithm based on the Rough Set Theory (RST) information system and the indiscernibility relation of granular computing in artificial intelligence.  Methods  An O-OFDM-ASDST signal detection algorithm is proposed based on the RST information system and the indiscernibility relation of granular computing in artificial intelligence. The algorithm consists of two stages: RST preprocessing and attribute reduction reconstruction. In the first stage, the RST information system is constructed by using preprocessed time-domain sampled values as the universe of discourse. The signal characteristics of these sampled values are converted into symbolic attributes, serving as the conditional attributes of the RST information system, while the upper and lower amplitude thresholds are designated as decision attributes. The RST attribute dependence formula, combined with an attribute importance-based addition and deletion method, is applied to establish decision rules and classify the information system. In the second stage, the indiscernibility relation is derived from the decision rules, and attribute reduction is performed on the constructed information system. This reduction process is applied to the time-domain sampled values within the upper and lower thresholds, followed by reconstruction.  Results and Discussions  The performance of the proposed detection algorithm is verified using the Monte Carlo simulation method. 
The results demonstrate that this algorithm effectively suppresses AWGN at the O-OFDM-ASDST receiver, enhances BER performance, and significantly reduces computational complexity and processing delay. For instance, when the PhotoDetector (PD) is positioned at the center of the room [3,3,0.85], the ACO-OFDM-ASDST system achieves Signal-to-Noise Ratio (SNR) gains of approximately 1 dB and 1.2 dB under 4QAM and 16QAM modulation, respectively, at a BER of 10⁻⁵. The DCO-OFDM-ASDST system achieves SNR gains of approximately 1 dB and 2 dB under the same conditions (Fig. 7). Similarly, when the PD is located at the edge of the room [0.5,0.5,0.85], the ACO-OFDM-ASDST system achieves SNR gains of approximately 0.8 dB and 1.1 dB for 4QAM and 16QAM, respectively, at a BER of 10⁻⁵, while the DCO-OFDM-ASDST system achieves SNR gains of approximately 2.5 dB and 3.2 dB (Fig. 8). The proposed detection algorithm also maintains favorable BER performance under different DC bias levels. For example, under 16QAM modulation with DC bias values of 0.3 V, 0.4 V, and 0.6 V, the DCO-OFDM-ASDST system achieves SNR gains of approximately 2.2 dB, 2 dB, and 0.8 dB, respectively, at a BER of 10⁻⁵ (Fig. 9). Furthermore, the complexity of the proposed detection algorithm in the ACO-OFDM-ASDST system is only one-tenth that of the contrast signal detection algorithm (Fig. 12). As the number of symbol decompositions increases, the proposed algorithm requires fewer computing resources compared to the contrast detection algorithm. For instance, in the ACO-OFDM-ASDST system with 4QAM, 16QAM, and 64QAM modulation, when the number of symbol decompositions is 4, the computational resources required by the contrast detection algorithm amount to 4096, whereas those required by the proposed detection algorithm are 408, 600, and 736, respectively. That is, computational resource consumption is reduced to approximately 1/10, 1/7, and 1/6 of that of the contrast algorithm, respectively (Fig. 13). 
Additionally, the proposed detection algorithm exhibits lower processing latency.  Conclusions  The O-OFDM-ASDST signal detection algorithm is implemented using the RST information system and the indiscernibility relation, effectively suppressing AWGN in decomposed symbols. The simulation results confirm the effectiveness of the proposed algorithm, demonstrating superior BER performance compared to other signal detection methods. Notably, favorable BER performance is maintained even at the room’s edge, highlighting the algorithm’s reliability, coverage, and robustness. Additionally, the proposed algorithm exhibits low complexity and reduced processing delay. It not only mitigates LED nonlinear distortion but also effectively suppresses AWGN in decomposition symbols, thereby enhancing BER performance and improving overall O-OFDM system performance.
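As an illustration of the rough-set machinery the detector builds on, the following minimal sketch computes indiscernibility classes and the attribute dependency γ(C, D) for a toy information system; the attribute names and values are illustrative only and are not taken from the paper.

```python
from collections import defaultdict

def indiscernibility(table, attrs):
    """Partition objects into classes that are indistinguishable on `attrs`."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def dependency(table, cond, dec):
    """gamma(C, D): fraction of objects whose C-class lies inside one D-class."""
    d_classes = indiscernibility(table, dec)
    pos = 0
    for c_class in indiscernibility(table, cond):
        if any(c_class <= d for d in d_classes):
            pos += len(c_class)
    return pos / len(table)

# Toy information system: condition attributes describe a sampled value,
# the decision attribute says whether it lies inside the amplitude thresholds.
table = {
    1: {"sign": "+", "mag": "high", "keep": "yes"},
    2: {"sign": "+", "mag": "low",  "keep": "yes"},
    3: {"sign": "-", "mag": "high", "keep": "no"},
    4: {"sign": "-", "mag": "low",  "keep": "yes"},
}
gamma_full = dependency(table, ["sign", "mag"], ["keep"])  # fully consistent
gamma_sign = dependency(table, ["sign"], ["keep"])         # dropping "mag" loses consistency
```

An attribute whose removal lowers γ is indispensable; attribute reduction keeps only such attributes, which is the idea behind the second-stage reduction and reconstruction.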
Dynamic Spectrum Access Algorithm for Evaluating Spectrum Stability in Cognitive Vehicular Networks
MA Bin, YANG Zumin, XIE Xianzhong
2025, 47(5): 1474-1485. doi: 10.11999/JEIT240927
Abstract:
  Objective  With the exponential growth of vehicle terminals and the widespread adoption of cognitive vehicular network applications, the existing licensed spectrum resources are inadequate to meet the communication demands of Cognitive Vehicular Networks (CVN). The rapid development of CVN and the increasing complexity of vehicular communication scenarios have intensified spectrum resource scarcity. Dynamic Spectrum Access (DSA) technology has emerged as a key solution to alleviate this scarcity by enabling efficient use of underutilized spectrum bands. While current DSA algorithms ensure basic spectrum utilization, they struggle to comprehensively evaluate spectrum stability and meet the differentiated stability requirements of vehicular network applications. For example, safety-critical applications such as collision avoidance systems demand ultra-reliable, low-latency communication, while infotainment applications prioritize high throughput. This paper proposes a novel framework integrating spectrum stability assessment with deep reinforcement learning. The framework constructs a multi-dimensional parameter-based model for spectrum stability, designs a reinforcement learning architecture incorporating gated mechanisms and dueling neural networks, and establishes a dynamically adaptive reward function to enable intelligent spectrum resource allocation. This research offers a solution for vehicular network spectrum management that combines theoretical depth with practical engineering value, paving the way for more reliable and efficient vehicular communication systems.  Methods  This study employs an integrated approach to address the spectrum allocation challenges in CVN. A time-series prediction model is developed using Long Short-Term Memory (LSTM) neural networks, which leverage three-dimensional time-series data of Signal-to-Noise Ratio (SNR), Received Signal Strength (RSS), and bandwidth to make multi-step predictions for future cycles. 
The rate of change for each parameter is calculated as a stability evaluation metric, providing a quantitative measure of spectrum stability. To ensure consistency in the evaluation process, the rate of change for each parameter is normalized using Min-Max normalization, and the standardized results are input into the K-Means algorithm for stability clustering of the rate-of-change vectors. By calculating the centroid coordinates of each cluster and their norms, a stability index is derived, forming the stability assessment model. Building upon the Deep Q-Network (DQN), a Gated Recurrent Unit (GRU) is introduced to create a temporal state encoder that captures the temporal dependencies in spectrum data. Additionally, a Dueling Network architecture is employed to decouple the state value and action advantage functions, enabling more accurate estimation of the long-term value of spectrum allocation decisions. The reward function incorporates trade-off coefficients to achieve a reasonable allocation of spectrum resources with different stability levels, ensuring a balance between spectrum utilization and collision probability while meeting the diverse stability requirements of vehicular network applications. The proposed framework is designed to be scalable and adaptable to various vehicular network scenarios, including urban, highway, and rural environments.  Results and Discussions  Simulation results show that the optimized stepwise prediction algorithm significantly improves performance. In both the training and test sets, the algorithm achieves a Root Mean Squared Error (RMSE) of less than 0.1, with no significant overfitting observed (Fig. 5, Fig. 6). This indicates that the algorithm generalizes well to unseen data, making it suitable for real-world deployment. Additionally, the loss function of the proposed algorithm decreases significantly as the number of iterations increases, converging around 150 iterations (Fig. 7). 
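The stability-assessment pipeline described in the Methods (rates of change of the predicted parameters, Min-Max normalization, K-Means clustering of the rate-of-change vectors, and a centroid-norm stability index) can be sketched as follows; the tiny clustering routine and all numeric values are illustrative stand-ins, not the paper's implementation:

```python
import math

def rates_of_change(series):
    """Relative change between consecutive predicted values of one parameter."""
    return [abs(b - a) / max(abs(a), 1e-9) for a, b in zip(series, series[1:])]

def min_max(values):
    """Min-Max normalization to [0, 1]."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def kmeans(points, k, iters=20):
    """Tiny k-means for small sets of 3-D rate-of-change vectors."""
    centroids = list(points[:k])
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            buckets[j].append(p)
        centroids = [tuple(sum(c) / len(b) for c in zip(*b)) if b else centroids[j]
                     for j, b in enumerate(buckets)]
    return centroids

# Rate of change of one predicted parameter (e.g. SNR), normalized:
snr_pred = [30.0, 29.5, 30.5, 28.0]
r = min_max(rates_of_change(snr_pred))

# Cluster per-channel (SNR, RSS, bandwidth) rate-of-change vectors; the
# stability index of a cluster is the norm of its centroid.
points = [(0.1, 0.2, 0.1), (0.15, 0.1, 0.2), (0.9, 0.8, 0.7), (0.85, 0.9, 0.8)]
centroids = kmeans(points, k=2)
stability = sorted(math.hypot(*c) for c in centroids)
```

A cluster whose centroid has a small norm groups channels whose SNR, RSS, and bandwidth barely change between prediction steps, i.e. stable spectrum.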
The prediction accuracy also stabilizes around 150 iterations (Fig. 8), suggesting that the algorithm achieves consistent performance within a reasonable training period. These results demonstrate that the proposed prediction algorithm can deliver high-accuracy multi-step predictions for stability parameters across a sufficient number of channels, providing a solid foundation for spectrum stability assessment. Furthermore, the proposed access algorithm consistently outperforms comparative algorithms in terms of spectrum utilization over 20 iterations, while maintaining lower collision probabilities (Fig. 9, Fig. 10). As the number of iterations increases, the cumulative stability index and throughput of the proposed algorithm steadily improve, exceeding the performance of comparative algorithms at all stages. This demonstrates that the proposed algorithm can meet the diverse requirements of vehicle terminals for channel stability and throughput, while ensuring high spectrum utilization and low collision probability. As the number of vehicle terminals increases, the proposed algorithm exhibits faster convergence compared to other algorithms, confirming its robustness in large-scale scenarios. These findings highlight the potential of the proposed framework to meet the growing demands of next-generation vehicular networks.  Conclusions  This study proposes an integrated “evaluation-decision-optimization” spectrum management paradigm for CVN. By proposing a multi-dimensional time-series feature-based spectrum stability quantification framework and designing a hybrid deep reinforcement learning architecture incorporating gated mechanisms and dueling networks, the research addresses the critical challenge of balancing spectrum efficiency with stability in dynamic vehicular environments. 
The development of an interpretable reward function enables intelligent spectrum allocation that adapts to diverse quality-of-service requirements, ensuring that both safety-critical and non-safety-critical applications receive the necessary resources. Experimental results show significant improvements in spectrum utilization, collision probability, and system throughput compared to traditional approaches, while maintaining robust performance in large-scale scenarios. These findings advance the theoretical understanding of spectrum management in CVN and provide a practical framework for implementing adaptive DSA solutions in next-generation intelligent transportation systems. Future research will explore extending the proposed framework to support multi-agent scenarios, where multiple vehicles and infrastructure nodes collaboratively optimize spectrum allocation. Additionally, integrating edge computing and federated learning techniques could further enhance the scalability and efficiency of the framework. The proposed methodology offers a scalable and efficient approach to spectrum resource allocation, paving the way for more reliable and high-performance vehicular communication networks.
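The dueling architecture referred to above decouples the state value from per-action advantages; a minimal sketch of the standard aggregation step (not the paper's full GRU-based network) is:

```python
def dueling_q(value, advantages):
    """Dueling-network aggregation:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes the V/A decomposition identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

# One state, four candidate channels (actions):
q = dueling_q(value=2.0, advantages=[0.5, -0.5, 1.0, -1.0])
# The greedy access decision picks the channel with the largest Q-value.
best = max(range(len(q)), key=q.__getitem__)
```

In the full agent, `value` and `advantages` would be produced by the shared GRU encoder's two heads rather than passed in as constants.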
Regularized Neural Network-Based Normalized Min-Sum Decoding for LDPC Codes
ZHOU Hua, ZHOU Ming, ZHANG Likang
2025, 47(5): 1486-1493. doi: 10.11999/JEIT240860
Abstract:
  Objective  The application of deep learning in communications has demonstrated significant potential, particularly in Low Density Parity Check (LDPC) code decoding. As a rapidly evolving branch of artificial intelligence, deep learning effectively addresses complex optimization problems, making it suitable for enhancing traditional decoding techniques in modern communication systems. The Neural-network Normalized Min-Sum (NNMS) algorithm has shown improved performance over the Min-Sum (MS) algorithm by incorporating trainable neural network models. However, NNMS decoding assigns independent training weights to each edge in the Tanner graph, leading to excessive training complexity and high storage overhead due to the large number of weight parameters. This significantly increases computational demands, posing challenges for implementation in resource-limited hardware. Moreover, the excessive number of weights leads to overfitting, where the model memorizes training data rather than learning generalizable features, degrading decoding performance on unseen codewords. This issue limits the practical applicability of NNMS-based decoders and necessitates advanced regularization techniques. Therefore, this study explores methods to reduce NNMS decoding complexity, mitigate overfitting, and enhance the decoding performance of LDPC codes.  Methods  Building on the traditional NNMS decoding algorithm, this paper proposes two partially weight-sharing models: VC-SNNMS (sharing weights for edges from variable nodes to check nodes) and CV-SNNMS (sharing weights for edges from check nodes to variable nodes). These models apply a weight-sharing strategy to specific edge types in the bipartite graph, reducing the number of training weights and computational complexity. To mitigate neural network overfitting caused by the high complexity of NNMS and its variants, a regularization technique is proposed. 
This leads to the development of the Regularized NNMS (RNNMS), Regularized VC-SNNMS (RVC-SNNMS), and Regularized CV-SNNMS (RCV-SNNMS) algorithms. Regularization refines network parameters by modifying the loss function and gradients, penalizing excessively large weights or redundant features. By reducing model complexity, this approach enhances the generalization ability of the decoding neural network, ensuring robust performance on both training and test data.  Results and Discussions  To evaluate the effectiveness of the proposed algorithms, extensive simulations are conducted under various Signal-to-Noise Ratio (SNR) conditions. The performance is assessed in terms of Bit Error Rate (BER), decoding complexity, and convergence speed. Additionally, a comparative analysis of NNMS, SNNMS, VC-SNNMS, CV-SNNMS, and their regularized variants systematically examines the effects of weight-sharing and regularization on neural network-based decoding. Simulation results show that for an LDPC code with a block length of 576 and a code rate of 0.75, when BER = 10⁻⁶, the RNNMS, RVC-SNNMS, and RCV-SNNMS algorithms achieve SNR gains of 0.18 dB, 0.22 dB, and 0.27 dB, respectively, compared to their corresponding NNMS, VC-SNNMS, and CV-SNNMS algorithms. Notably, the RVC-SNNMS algorithm demonstrates the best performance, with SNR gains of 0.55 dB, 0.51 dB, and 0.22 dB compared to the BP, NNMS, and SNNMS algorithms, respectively (Fig. 3). Furthermore, under different numbers of decoding iterations, the RVC-SNNMS algorithm consistently outperforms the others in BER performance. Specifically, at BER = 10⁻⁶ with 15 decoding iterations, it achieves SNR gains of 0.57 dB and 0.1 dB compared to the NNMS and SNNMS algorithms, respectively (Fig. 4). Similarly, for an LDPC code with a block length of 1056, when BER = 10⁻⁵ and 10 decoding iterations are used, the RVC-SNNMS algorithm attains SNR gains of 0.34 dB and 0.08 dB compared to the NNMS and SNNMS algorithms, respectively (Fig. 5).  
Conclusions  This study investigates the performance of NNMS and SNNMS for LDPC code decoding and proposes two partially weight-sharing algorithms, VC-SNNMS and CV-SNNMS. Simulation results show that weight-sharing strategies effectively reduce training complexity while maintaining competitive BER performance. To address the overfitting issue associated with the high complexity of NNMS-based algorithms, regularization is incorporated, leading to the development of RNNMS, RVC-SNNMS, and RCV-SNNMS. Regularization effectively mitigates overfitting, enhances network generalization, and improves error-correcting performance for various LDPC codes. Simulation results indicate that the RVC-SNNMS algorithm achieves the best decoding performance due to its reduced complexity and the improved generalization provided by regularization.
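The effect of the regularization described above can be illustrated with a single L2-penalized gradient step; the learning rate, penalty coefficient, and weight values below are illustrative, not the paper's training configuration:

```python
def regularized_step(weights, grads, lr=0.05, lam=0.01):
    """One gradient step on loss L + (lam/2) * ||w||^2.
    The penalty term adds lam * w to each gradient, shrinking large
    weights and discouraging the network from memorizing training data."""
    return [w - lr * (g + lam * w) for w, g in zip(weights, grads)]

w = [1.0, -4.0, 0.2]  # e.g. per-edge NNMS weights, one unusually large
w_new = regularized_step(w, grads=[0.0, 0.0, 0.0])
# With a zero data gradient, the penalty alone pulls every weight toward zero.
```

The same shrinkage acts during real training on top of the data gradient, which is how the regularized variants curb overfitting without changing the decoder's structure.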
Radar, Navigation and Array Signal Processing
Collaborative Interference Resource Allocation Method Based on Improved Secretary Bird Algorithm
LI Yibing, SUN Liuqing, QI Changlong
2025, 47(5): 1494-1504. doi: 10.11999/JEIT240709
Abstract:
  Objective  In the complex electromagnetic environment of Networked Radars (NR), efficiently utilizing limited interference resources to reduce enemy detection capabilities and support successful penetration remains a critical challenge. Existing heuristic algorithms, while partially effective, do not jointly optimize interference patterns, beams, and power resources in multi-beam systems, limiting their applicability in penetration scenarios. To address this limitation, this study proposes an interference resource allocation strategy based on the Improved Secretary Bird Optimization Algorithm (ISBOA). The proposed strategy minimizes detection probability by integrating Cauchy mutation and global collaborative control, enabling the joint optimization of interference patterns, beams, and power across multiple jammers. This approach ensures rational resource allocation, enhances search capability, and improves convergence accuracy, thereby meeting the demands of penetration scenarios. The findings provide a novel solution for interference resource allocation in multi-beam systems against NR.  Methods  This study models the complex interference resource allocation problem as a multi-constrained nonlinear mixed-integer programming problem and addresses it using an improved intelligent optimization algorithm. A mixed-integer programming model incorporating interference patterns, beams, and power resources is established, with the detection and fusion probability of networked radar as the performance evaluation metric. The model accounts for the dynamic interactions between radars and jammers, as well as the pulse compression gains of various interference patterns. To overcome the limitations of the traditional Secretary Bird Optimization Algorithm (SBOA) in handling discrete variables and complex constraints, the study integrates Cauchy mutation and global collaborative control strategies. 
Cauchy mutation leverages its long-tail characteristics to enhance the algorithm’s global search capability, reducing the risk of convergence to local optima. The global collaborative control strategy incorporates penalty factors to ensure compliance with multi-variable constraints, enabling the simultaneous optimization of discrete and continuous variables.  Results and Discussions  This study presents an innovative interference resource allocation method for multi-beam jamming systems targeting networked radar, leveraging the ISBOA. By integrating Cauchy mutation and global cooperative control strategies, ISBOA significantly enhances optimization performance. Simulation results indicate that ISBOA outperforms other algorithms, including the original SBOA, Harris Hawks Optimizer (HHO), and Sparrow Search Algorithm (SSA). In a scenario with six jammers and eight radars, ISBOA achieves an optimal function value of 0.6095, which is notably lower than 0.8158 (SBOA), 1.2666 (HHO), and 1.3679 (SSA) (Fig. 4). Moreover, ISBOA demonstrates faster convergence and greater stability across 50 independent experiments, yielding an average optimal function value of 0.6892 (Fig. 5) and a convergence error of 0.1449 (Fig. 6). ISBOA’s joint optimization of interference patterns, beams, and power resources enables more efficient allocation of jamming resources and reduces the detection probability of networked radar. This advantage is further validated across various scenarios, where ISBOA consistently outperforms other algorithms in solution quality and computational efficiency (Fig. 8). The experimental results highlight ISBOA’s robustness and adaptability, demonstrating its potential for application in complex battlefield environments.  Conclusions  This study proposes an optimization method for interference resource allocation in multi-beam jamming systems targeting networked radar scenarios, utilizing ISBOA. 
A mixed-integer programming model integrating interference patterns, beams, and power resources is developed. ISBOA incorporates Cauchy mutation and global cooperative control strategies to enhance global search capability and stability. Simulation results demonstrate that ISBOA outperforms the original SBOA, HHO, and SSA in terms of convergence speed and search efficiency. ISBOA exhibits superior stability and enables more rational allocation of interference resources, effectively reducing the detection probability of networked radar. Moreover, ISBOA demonstrates strong adaptability and robustness across various scenarios, providing an effective solution for interference resource allocation in complex battlefield environments.
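The Cauchy mutation used in ISBOA exploits the heavy tails of the Cauchy distribution, which occasionally produce large jumps that help candidates escape local optima. A minimal sketch using inverse-CDF sampling follows; the scale, bounds, and the power-allocation interpretation are illustrative assumptions:

```python
import math
import random

def cauchy_mutate(x, scale, lo, hi, rng=random):
    """Perturb a candidate with a Cauchy-distributed step, sampled via the
    inverse CDF: step = scale * tan(pi * (u - 0.5)), u ~ Uniform(0, 1).
    The result is clipped back into the feasible range [lo, hi]."""
    step = scale * math.tan(math.pi * (rng.random() - 0.5))
    return min(hi, max(lo, x + step))

rng = random.Random(0)
# Mutate one jammer's power allocation inside its feasible range [0, 10] W:
powers = [cauchy_mutate(5.0, scale=0.5, lo=0.0, hi=10.0, rng=rng) for _ in range(5)]
```

Compared with a Gaussian perturbation of the same scale, the Cauchy step makes distant exploratory moves far more likely, which is the property the improved algorithm relies on.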
Joint Design of Transmission Sequences and Receiver Filters Based on the Generalized Cross Ambiguity Function
WEN Cai, WEN Shu, ZHANG Xiang, XIAO Hao, LI Zhangping
2025, 47(5): 1505-1516. doi: 10.11999/JEIT240905
Abstract:
  Objective  A set of orthogonal waveforms with favorable correlation properties enhances target detection and anti-jamming performance in Multiple-Input Multiple-Output (MIMO) radar systems. Jointly designing the transmit sequence set and receive filter bank introduces additional degrees of freedom, reducing auto- and cross-correlation. However, research on their joint design based on the Generalized Cross Ambiguity Function (GCAF) is limited and primarily focuses on reducing the peak sidelobe level. Since a low Integrated Sidelobe Level (ISL) is also critical for radar imaging and target detection, this study formulates the joint design problem with the objective of minimizing the ISL of the GCAF, subject to mainlobe gain and dynamic range constraints.  Methods  This paper proposes a Maximum Block Improvement–Successive Convex Approximation (MBI-SCA) method for the nonconvex optimization problem involving High-Order Polynomials (HOP). The MBI algorithm decomposes the nonconvex problem into multiple subproblems, which are then solved iteratively using the SCA method. To further reduce computational cost, an Alternating Direction Penalty Method (ADPM) is introduced. This algorithm, which supports parallel implementation, dynamically updates the penalty factor in each iteration, ensuring the penalty term gradually converges to zero. This guarantees algorithm convergence and accelerates the search for a better feasible solution.  Results and Discussions  The proposed MBI-SCA algorithm converges in approximately 12 iterations, while the MBI-ADPM algorithm achieves faster convergence in about 10 iterations (Fig. 1). The running time of the MBI-ADPM algorithm increases as the sequence length varies from 16 to 512, whereas the MBI-SCA algorithm exhibits an overall increase, with a decrease at $2^8$, likely due to a decrease in the number of iterations when the SCA method solves the subproblem (Fig. 2). 
Both algorithms demonstrate strong performance, with GCAF values in the locally optimized region significantly lower than those in the unoptimized region, all below –200 dB (Fig. 3). However, MBI-ADPM achieves better local optimization, reducing GCAF values to –320 dB, whereas MBI-SCA reaches only –260 dB (Fig. 4). The parameter $K$ determines the range of the optimization interval, and as $K$ increases from 5 to 35, the ISL values of both methods also increase. For MBI-SCA, the optimal range of the parameter $K$ is $5 \le K \le 15$, where the integrated sidelobe levels remain below –50 dB, meeting the low sidelobe requirement. In contrast, MBI-ADPM performs best when $K$ is 5 or 10, achieving an objective function value close to –300 dB (Fig. 7).  Conclusions  This paper proposes a joint design method for transmit waveforms and receive filters that minimizes the GCAF ISL under mainlobe gain and dynamic range constraints, addressing the reduction of auto- and cross-correlation integral levels in MIMO radar waveform sets. To solve the quartic nonconvex optimization problem, the original problem is first decomposed into manageable subproblems using the MBI algorithm, which are then solved iteratively with the SCA algorithm. To further reduce computational complexity, the ADPM algorithm is introduced to solve the SCA subproblems. Simulation results demonstrate that the MBI-ADPM algorithm converges faster and achieves a lower ISL than MBI-SCA for shorter distance intervals of interest.
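For reference, the ISL objective minimized above can be illustrated for a waveform set as the total energy in autocorrelation sidelobes and cross-correlation lags. This sketch uses real-valued sequences and non-negative lags only, a deliberate simplification of the full GCAF-based criterion:

```python
def cross_corr(x, y, k):
    """Aperiodic correlation of two equal-length sequences at lag k >= 0."""
    return sum(x[n + k] * y[n] for n in range(len(x) - k))

def isl(seqs):
    """Integrated Sidelobe Level of a waveform set: energy in all
    autocorrelation sidelobes plus all (non-negative) cross-correlation lags."""
    total = 0.0
    n = len(seqs[0])
    for i, x in enumerate(seqs):
        for j, y in enumerate(seqs):
            # Skip only each sequence's own zero-lag (mainlobe) term.
            lags = range(1, n) if i == j else range(n)
            total += sum(cross_corr(x, y, k) ** 2 for k in lags)
    return total

# Two binary sequences; a lower ISL means better joint auto/cross-correlation.
val = isl([[1, 1, -1, 1], [1, -1, -1, -1]])
```

The joint design replaces plain correlations with the GCAF between transmit sequences and receive filters, but the objective retains this same sum-of-squared-sidelobes structure.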
Error State Kalman Filter Multimodal Fusion SLAM Based on MICP Closed-loop Detection
CHEN Dan, CHEN Hao, WANG Zichen, ZHANG Heng, WANG Changqing, FAN Lintao
2025, 47(5): 1517-1528. doi: 10.11999/JEIT240980
Abstract:
  Objective   Single-sensor Simultaneous Localization and Mapping (SLAM) technology has limitations, including large mapping errors and poor performance in textureless or low-light environments. Laser-based 2D-SLAM performs poorly in natural and dynamic outdoor environments and cannot capture object information below the scanning plane of a single-line LiDAR. Additionally, long-term operation leads to cumulative errors from sensor noise and model inaccuracies, significantly affecting positioning accuracy. This study proposes an Error State Kalman Filter (ESKF) multimodal tightly coupled 2D-SLAM algorithm based on LiDAR MICP closed-loop detection to enhance environmental information acquisition, trajectory estimation, and relative pose estimation. The proposed approach improves SLAM accuracy and real-time performance, enabling high-precision and complete environmental mapping in complex real-world scenarios.   Methods   Firstly, sensor data is spatiotemporally synchronized, and the LiDAR point cloud is denoised. MICP matching closed-loop detection, including initial ICP matching, sub-ICP matching, and key ICP matching, is then applied to optimize point cloud matching. Secondly, an odometer error model and a point cloud matching error model between LiDAR and machine vision are constructed. Multi-sensor data is fused using the ESKF to obtain more accurate pose error values, enabling real-time correction of the robot’s pose. Finally, the proposed MICP-ESKF SLAM algorithm is compared with several classic SLAM methods in terms of closed-loop detection accuracy, processing time, and robot pose accuracy under different data samples and experimentally validated on the Turtlebot2 robot platform.   Results and Discussions   This study addresses the reduced accuracy of 2D grid maps due to accumulated odometry errors in large-scale mobile robot environments. 
To overcome the limitations of visual and laser SLAM, the paper examines laser radar Multi-layer Iterative Closest Point (MICP) matching closed-loop detection and proposes a visual-laser odometry tightly coupled SLAM method based on the ESKF. The SLAM algorithm incorporating MICP closed-loop detection achieves higher accuracy than the Cartographer algorithm on the test set. Compared to the Karto algorithm, the proposed MICP-ESKF-SLAM algorithm shows significant improvements in detection accuracy and processing speed. As shown in Table 2, the multimodal MICP-ESKF-SLAM algorithm has the lowest median relative pose error, approximately 3% of that of the Gmapping algorithm. The average relative pose error is reduced by about 40% compared to the MICP-SLAM algorithm, demonstrating the advantages of the proposed approach in high-precision positioning. Furthermore, multi-sensor fusion via ESKF effectively reduces cumulative errors caused by frequency discrepancies and sensor noise, ensuring timely robot pose updates and preventing map drift.  Conclusions   This study proposes a 2D-SLAM algorithm that integrates MICP matching closed-loop detection with ESKF. By estimating errors, optimizing state updates, and applying corrected increments to the main state, the approach mitigates cumulative drift caused by random noise and internal error propagation in dynamic environments. This enhances localization and map construction accuracy while improving the real-time performance of multi-sensor tightly coupled SLAM. The ESKF multi-sensor tightly coupled SLAM algorithm based on multi-layer ICP matching closed-loop detection is implemented on the Turtlebot2 experimental platform for large-scale scene localization and mapping. Experimental results demonstrate that the proposed algorithm effectively integrates LiDAR and machine vision data, achieving high-accuracy robot pose estimation and stable performance in dynamic environments. 
It enables the accurate construction of a complete, drift-free environmental map, addressing the challenges of 2D mapping in complex environments that single-sensor SLAM algorithms struggle with, thereby providing a foundation for future research on intelligent mobile robot navigation.
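The ESKF correction step at the core of the fusion, estimating the error state from a measurement residual and injecting it back into the nominal state, can be sketched in one dimension; all gains and noise values below are illustrative:

```python
def eskf_scalar_update(x_nom, p, z, h, r):
    """Scalar ESKF measurement update: estimate the *error* dx around the
    nominal state, inject it, then reset the error mean to zero.
    x_nom: nominal state, p: error covariance,
    z: measurement, h: observation gain, r: measurement noise variance."""
    innov = z - h * x_nom       # measurement residual
    s = h * p * h + r           # innovation covariance
    k = p * h / s               # Kalman gain
    dx = k * innov              # estimated error state
    x_nom += dx                 # inject the correction into the nominal state
    p = (1.0 - k * h) * p       # covariance update; error mean resets to 0
    return x_nom, p

# One update with unit prior covariance and unit measurement noise:
x, p = 1.0, 1.0
x, p = eskf_scalar_update(x, p, z=2.0, h=1.0, r=1.0)
```

In the full system the state is the robot pose, the residual comes from odometry and LiDAR/vision point-cloud matching errors, and the same inject-and-reset pattern bounds the cumulative drift.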
Methods for Enhancing Positioning Reliability in Indoor and Underground Satellite-shielded Environments
YI Qingwu, HUANG Lu, YU Baoguo, LIAO Guisheng
2025, 47(5): 1529-1542. doi: 10.11999/JEIT240870
Abstract:
This paper proposes a method to enhance the reliability of indoor positioning by combining an unsupervised autoencoder with nonlinear filtering. A Denoising Variational AutoEncoder model assisted by a deep Convolutional Neural Network (DVAE-CNN) is designed to regulate the positioning results from multiple aspects, including measurement data quality evaluation, target state transition modeling, and weight update strategies aided by environmental prior information. This approach addresses the issue of low positioning reliability caused by information loss, errors, and disturbances in complex indoor environments. Compared to positioning results without the reliability control mechanism, the proposed method improves average positioning accuracy by 74.6% and positioning reliability by 88.2%. Extensive experiments conducted in the venues of the Beijing 2022 Winter Olympics demonstrate that the proposed method provides highly robust, reliable, and continuous positioning services, showing significant potential for practical application and promotion.   Objective  With the rapid development of indoor positioning technologies, ensuring high reliability and trustworthiness in complex indoor and underground satellite-shielded environments remains a critical challenge. Existing methods often prioritize accuracy and continuity but neglect reliability under environmental disturbances such as signal loss, noise, and multipath effects. To address these limitations, this study proposes a multi-level trustworthiness enhancement framework by integrating an unsupervised Denoising Variational AutoEncoder with a Convolutional Neural Network (DVAE-CNN) and nonlinear particle filtering. The goal is to improve positioning reliability through data quality assessment, environmental prior information fusion, and adaptive state transition constraints, thereby supporting robust location-based services in challenging environments like the 2022 Beijing Winter Olympics venues.  
Methods  The proposed framework combines a DVAE-CNN model for denoising and feature extraction with a particle filtering mechanism incorporating environmental priors and sensor data. The DVAE-CNN evaluates measurement data quality by reconstructing noisy inputs and identifying anomalies through reconstruction probability thresholds. Concurrently, nonlinear particle filtering integrates multi-source heterogeneous data (e.g., inertial sensors, Wi-Fi, and indoor maps) to constrain particle distributions based on motion patterns and structural boundaries. A weight update strategy dynamically adjusts particle importance using prior knowledge, while adaptive step-length estimation refines Pedestrian Dead Reckoning (PDR) to reduce cumulative errors.  Results and Discussions  Extensive experiments in controlled environments and real-world Olympic venues demonstrate significant improvements. Compared to baseline methods without trustworthiness mechanisms, the proposed approach achieves a 74.6% increase in average positioning accuracy and an 88.2% enhancement in reliability. In dynamic tests at the Beijing Winter Olympics venues, the method eliminated trajectory jumps caused by signal loss and improved coverage continuity by 34%, ensuring seamless navigation in complex indoor spaces. The fusion of DVAE-CNN-based anomaly detection and environmental constraints effectively suppressed “wall-penetrating” particles, enhancing result plausibility.  Conclusions  This study addresses the critical issue of positioning trustworthiness in indoor and underground environments by integrating data-driven anomaly detection with multi-source fusion. Key contributions include: (1) A DVAE-CNN model that improves data quality assessment and noise resilience; (2) A particle filtering framework leveraging environmental priors and adaptive PDR for robust state estimation; (3) Validation in high-stakes scenarios, achieving sub-meter accuracy and high reliability. 
Limitations, such as PDR’s cumulative errors, warrant further exploration. Future work will focus on real-time optimization and sensor noise modeling for broader applications.
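The prior-constrained weight update in the particle filter can be sketched as follows; the Gaussian likelihood, the RSS-style measurement, and the corridor-boundary map are illustrative assumptions, not the paper's model:

```python
import math

def update_weights(particles, weights, z, sigma, walkable):
    """Reweight particles by measurement likelihood, zeroing any particle
    that violates the map prior (e.g. has crossed a wall), then renormalize."""
    new = []
    for (pos, pred), w in zip(particles, weights):
        if not walkable(pos):
            new.append(0.0)  # implausible "wall-penetrating" particle: suppress
            continue
        lik = math.exp(-0.5 * ((z - pred) / sigma) ** 2)
        new.append(w * lik)
    s = sum(new)
    return [w / s for w in new] if s > 0 else weights

# Toy corridor: only 0 <= x <= 5 is walkable.
walkable = lambda pos: 0.0 <= pos[0] <= 5.0
# Each particle: ((x, y) position, predicted measurement at that position).
particles = [((1.0, 0.0), -60.0), ((6.0, 0.0), -60.0)]
w = update_weights(particles, [0.5, 0.5], z=-61.0, sigma=3.0, walkable=walkable)
```

Zeroing out particles that breach structural boundaries is what keeps the posterior from drifting through walls, while the fallback to the old weights guards against degenerate cases where every particle is suppressed.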
Global Navigation Satellite System Partial Ambiguity Resolution Method Integrating Ionospheric Delay Correction and Multi-frequency Signal Optimization
ZHANG Xu, YANG Jie
2025, 47(5): 1543-1553. doi: 10.11999/JEIT240682
Abstract:
  Objective  Global Navigation Satellite System (GNSS) high-precision positioning is widely applied due to its accuracy. However, the integrity of the Ambiguity Resolution (AR) process remains limited, particularly in occluded environments and over long baselines. Traditional AR methods are often affected by ionospheric delay errors, which become substantial when the ionospheric conditions differ between reference and rover stations. This paper proposes a Modified Partial Ambiguity Resolution (MPAR) method that integrates ionospheric delay correction models with multi-frequency signal optimization. The combined approach improves GNSS positioning accuracy and reliability under varied environmental and baseline conditions.  Methods  To reduce the effect of ionospheric delay on AR, this study incorporates an ionospheric delay correction model into the geometry-free Cascade Integer Resolution (ICIR) method. ICIR resolves the integer ambiguities of Extra-Wide Lane (EWL), Wide Lane (WL), and Narrow Lane (NL) combinations using carrier phase measurements with different wavelengths. The ionospheric delay correction model enables compensation for differential delays between stations, improving AR accuracy, particularly over long baselines. To further enhance data usage—especially in cases of low-quality observations—a two-stage partial AR strategy is employed. In the first stage, the ICIR method is applied to an optimal subset of satellites selected based on tri-frequency availability and high elevation angles. For the non-optimal subset, which may include satellites with limited frequencies or weaker signal quality, the Least-Squares AMBiguity Decorrelation Adjustment (LAMBDA) method is used in geometric mode, with assistance from the ambiguity-fixed results of the optimal subset. This integrated approach reduces computational complexity and improves the AR success rate and reliability. 
The MPAR method proceeds as follows: (1) select the optimal satellite subset based on frequency availability and elevation angle; (2) apply the ICIR method to resolve ambiguities in this subset; (3) use the fixed ambiguities from the optimal subset to assist in resolving ambiguities for the non-optimal subset via the LAMBDA method; (4) obtain the final integer ambiguity solution for the full epoch.  Results and Discussions  The proposed MPAR method is validated using two datasets collected under different environments: one from Tokyo, characterized by complex urban occlusion and long baselines (approximately 1 700 meters), and another from Wuhan, featuring an open campus environment and short baselines (approximately 600 meters). The results show that the MPAR method outperforms traditional PAR methods in positioning accuracy, AR success rate, and computational efficiency. As shown in (Fig. 3) and (Fig. 5), satellite visibility in the Tokyo dataset is significantly affected by occlusion, leading to fewer available satellites compared to the Wuhan dataset. Despite these challenges, the MPAR method achieves the highest success rate and the lowest Average Standard Deviation (ASD) in all tested scenarios, including GPS, BDS, and dual-system modes (Table 2 and Table 4). In the Tokyo dataset, the MPAR method reduces the ASD by up to 40% compared to traditional methods, reflecting its robustness in complex environments. The AR success rate also significantly improves with the MPAR method. As presented in (Table 3) and (Table 5), the MPAR method achieves AR success rates exceeding 86% in all tested scenarios, with a peak rate of 99.4% in the GPS/BDS dual-system mode of the Wuhan dataset. These results demonstrate the effectiveness of the proposed method in enhancing AR reliability under challenging conditions. In terms of computational efficiency, the MPAR method exhibits balanced performance. 
Although the use of the ionospheric delay correction model slightly increases computational complexity, the overall efficiency remains competitive, with an average solution time of approximately 0.13 seconds per epoch (Table 3 and Table 5). This performance supports the suitability of the MPAR method for real-time applications. Furthermore, Table 3 (Tokyo dataset) and Table 5 (Wuhan dataset) summarize the performance metrics of the five AR methods evaluated. The MPAR-ICIR method achieves the highest AR success rates across all systems and environments, reaching 93.1% and 99.4% for the Tokyo and Wuhan datasets, respectively. Notably, the MPAR-ICIR method maintains a high success rate while reducing computation time compared to other methods, indicating its efficiency. These results support the effectiveness and robustness of the proposed MPAR-ICIR method in improving GNSS positioning performance.  Conclusions  This study proposes an MPAR method for high-precision GNSS positioning. By integrating ionospheric delay correction models with multi-frequency signal optimization, MPAR combines the strengths of geometry-free and geometry-based AR strategies. The method effectively reduces the effect of ionospheric delay, particularly over long baselines and in occluded environments. Experimental results confirm that MPAR improves positioning accuracy, AR success rate, and computational efficiency relative to conventional methods. Its consistent performance across varied environments and baseline lengths highlights its suitability for broad application in high-precision GNSS positioning.
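The two-stage strategy described above can be sketched in a few lines of Python. This is a minimal, illustrative sketch only: the satellite records, the elevation and frequency thresholds, and the plain rounding in `cascade_fix` are hypothetical simplifications, not the paper's actual ICIR/LAMBDA implementation.

```python
def select_optimal_subset(sats, min_freqs=3, elev_mask_deg=30.0):
    """Step (1): split satellites into an optimal subset (tri-frequency,
    high elevation) and a non-optimal remainder. Thresholds here are
    illustrative, not the paper's values."""
    optimal = [s for s in sats
               if s["n_freq"] >= min_freqs and s["elev_deg"] >= elev_mask_deg]
    rest = [s for s in sats if s not in optimal]
    return optimal, rest

def cascade_fix(ewl_float, wl_float, nl_float):
    """Step (2), heavily simplified: fix the Extra-Wide-Lane ambiguity
    first (longest wavelength, easiest to round), then Wide-Lane, then
    Narrow-Lane. Real ICIR constrains each stage with the previous fix;
    here each float ambiguity is simply rounded to the nearest integer."""
    return round(ewl_float), round(wl_float), round(nl_float)

# toy epoch: three satellites of differing quality
sats = [
    {"prn": "G01", "n_freq": 3, "elev_deg": 55.0},
    {"prn": "G07", "n_freq": 2, "elev_deg": 40.0},  # dual-frequency only
    {"prn": "G12", "n_freq": 3, "elev_deg": 12.0},  # low elevation
]
optimal, rest = select_optimal_subset(sats)
```

In the full method, the integer ambiguities fixed for `optimal` would then constrain a geometry-based LAMBDA search over `rest` (steps (3) and (4)).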
Image and Intelligent Information Processing
A Cybersecurity Entity Recognition Approach Based on Character Representation Learning and Temporal Boundary Diffusion
HU Ze, LI Wenjun, YANG Hongyu
2025, 47(5): 1554-1568. doi: 10.11999/JEIT240953
Abstract:
  Objective  The vast amount of unstructured cybersecurity information available online holds significant value. Named Entity Recognition (NER) in cybersecurity facilitates the automatic extraction of such information, providing a foundation for cyber threat analysis and knowledge graph construction. However, existing cybersecurity NER research remains limited, primarily relying on general-purpose approaches that struggle to generalize effectively to domain-specific datasets, often resulting in errors when recognizing cybersecurity-specific terms. Some recent studies decompose the NER task into entity boundary detection and entity classification, optimizing these subtasks separately to enhance performance. However, the representation of complex cybersecurity entities often exceeds the capability of single-feature semantic representations, and existing boundary detection methods frequently produce misjudgments. To address these challenges, this study proposes a cybersecurity entity recognition approach based on character representation learning and temporal boundary diffusion. The approach integrates character-level feature extraction with a boundary diffusion network based on a denoising diffusion probabilistic model. By focusing on optimizing entity boundary detection, the proposed method improves performance in cybersecurity NER tasks.  Methods  The proposed approach divides the NER task into two subtasks: entity boundary detection and entity classification, which are processed independently, as illustrated (Fig. 1). For entity boundary detection, a Question-Answering (QA) framework is adopted. The framework first generates questions about the entities to be extracted, concatenates them with the corresponding input sentences, and encodes them using a pre-trained BERT model to extract preliminary semantic features. 
Character-level feature extraction is then performed using a Dilated Convolutional Residual Character Network (DCR-CharNet), which processes character-level information through dilated residual blocks. Dilated convolution expands the model’s receptive field, capturing broader contextual information, while a self-attention mechanism dynamically identifies key features. These components enhance the global representation of input data and provide multi-dimensional feature representations. A Temporal Boundary Diffusion Network (TBDN) is then applied for entity boundary detection. TBDN employs a fixed forward diffusion process that introduces Gaussian noise to entity boundaries at each time step, progressively blurring them. A learnable reverse diffusion process subsequently predicts and removes noise at each time step, enabling the gradual recovery of accurate entity boundaries and leading to precise boundary detection. For entity classification, an independent network is trained to assign labels to detected entities. Like boundary detection, this subtask also adopts a QA framework. A cybersecurity-specific pre-trained language model, SecRoBERTa, encodes the concatenated question and input data to extract entity classification features. These features are then processed through a linear-layer-based entity classifier, which outputs the recognized entity type.  Results and Discussions   The performance of the proposed approach is evaluated on the DNRTI cybersecurity dataset, with comparative results against baseline methods presented (Table 3). The proposed approach achieved a 0.40% improvement in F1-score over UTERMMF, a model incorporating character-level, part-of-speech, and positional features along with inter-word relationship classification. Compared to CTERMRFRAT, which employs an adversarial training framework, the proposed approach improved the F1-score by 1.65%. 
Additionally, it outperformed BERT+BiLSTM+CRF by 5.20% and achieved gains of 12.21%, 17.90%, and 18.31% over BERT, CNN+BiLSTM+CRF, and IDCNN+CRF, respectively. These results highlight that boundary detection accuracy is a key factor limiting NER performance, and optimizing boundary detection methods can significantly enhance overall model effectiveness. The proposed approach’s emphasis on boundary detection enables more accurate identification of entity boundaries, contributing to higher F1-scores. However, in terms of accuracy, it slightly underperforms CNN+BiLSTM+CRF. This discrepancy is attributed to class imbalance in the dataset, where certain categories are overrepresented while others are underrepresented. The approach demonstrates strong performance in handling minority categories, but its focus on rare entities slightly reduces prediction accuracy for common categories, affecting overall accuracy. Despite this trade-off, the approach enhances entity boundary detection, reducing misidentifications and improving precision and recall, thereby increasing the F1-score. Errors in boundary detection may propagate to the entity classification stage, impacting overall accuracy. However, the proposed two-stage approach, which prioritizes boundary detection optimization, ensures more precise boundary identification, which is crucial for improving NER performance. In terms of computational efficiency, the proposed approach is compared with DiffusionNER (Table 4), another diffusion-based NER model. Results indicate that the proposed approach requires fewer parameters, achieves faster inference speeds, and delivers higher F1-scores under the same hardware and software conditions.  Conclusions  Enhancing boundary detection efficiency significantly improves NER performance. The proposed approach reduces resource consumption while achieving superior performance compared to recent baseline methods in cybersecurity NER tasks.
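The fixed forward diffusion process that TBDN applies to entity boundaries can be illustrated with the standard closed-form DDPM noising step. This is a generic sketch under assumed notation (x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps, with alpha_bar_t the cumulative product of 1 - beta); the learnable reverse network that denoises the boundaries back, and the paper's actual noise schedule, are omitted.

```python
import math
import random

def forward_diffuse(boundary, t, betas, rng=random):
    """Forward diffusion applied to entity-boundary coordinates:
    progressively blurs the boundaries with Gaussian noise. `boundary`
    is a list of (normalized) start/end positions; `betas` is an
    assumed noise schedule."""
    alpha_bar = 1.0
    for beta in betas[: t + 1]:
        alpha_bar *= 1.0 - beta  # cumulative product of (1 - beta)
    return [math.sqrt(alpha_bar) * x
            + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
            for x in boundary]
```

With an all-zero schedule the boundaries pass through unchanged; as t grows under a nonzero schedule, the samples approach pure Gaussian noise, which is the starting point of the reverse (boundary-recovering) process.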
Model and Data Dual-driven Joint Limited-Angle CT Reconstruction and Metal Artifact Reduction Method
SHI Baoshun, CHENG Shizhan, JIANG Ke, FU Zhaoran
2025, 47(5): 1569-1581. doi: 10.11999/JEIT240703
Abstract:
  Objective  Computed Tomography (CT) is widely used in medical diagnostics due to its non-destructive, non-contact imaging capabilities. To lower cancer risk from radiation exposure, clinical practice often limits scanning angles, referred to as Limited-Angle CT (LACT). Incomplete projection data in LACT leads to wedge-shaped artifacts in reconstructions using Filtered Back-Projection (FBP) algorithms. These artifacts worsen in the presence of metallic implants. Although LACT reconstruction without metal and full-angle CT Metal Artifact Reduction (MAR) have been extensively studied, the joint task of Limited-Angle and Metal Artifact Reduction (LAMAR) has received limited attention. This study proposes a model- and data-driven CT network that integrates a Task Selection (TS) module to apply appropriate gradient descent steps for different tasks. This enables simultaneous processing of LACT and LAMAR. The network also incorporates dual-domain information interaction during alternating iterations to reconstruct high-quality CT images.  Methods  First, a dual-domain reconstruction model integrating model- and data-driven approaches is constructed to address the joint task of LACT reconstruction and LAMAR. The model comprises four components: an image-domain data fidelity term, a projection-domain data fidelity term, an image-domain regularization term, and a projection-domain regularization term. These terms are solved using an alternating iteration strategy. The image- and projection-domain subproblems are addressed using the proximal gradient descent algorithm, with the iterative process unrolled into a Deep Neural Network (DNN). Each stage of the deep unrolling network includes three components: a TS module, a projection-domain subnetwork, and an image-domain subnetwork. The TS module dynamically determines gradient descent step sizes for the LACT and LAMAR tasks by comparing image-domain FBP reconstruction results with predefined thresholds. 
The projection-domain subnetwork is shared by both tasks. Finally, the data-driven proximal network comprises the projection-domain and image-domain subnetworks. The projection-domain subnetwork includes an encoder, a dual-branch structure, and a decoder. The encoder has two stages, each consisting of a convolutional layer followed by an activation function; the decoder mirrors this architecture. A Transformer-based long-range branch incorporates non-metal trace information into a self-attention mechanism to guide correction of metal trace data using contextual information from non-metal regions. A short-range branch, composed of six residual blocks, extracts local features. The outputs of the two branches are fused using a weighted strategy before being passed to the decoder. The image-domain subnetwork is implemented as an attention-based U-Net. Channel and spatial attention mechanisms are applied before each of the four downsampling operations in the U-Net encoder. This design allows the decoder to more effectively leverage encoded information for high-quality CT image reconstruction without increasing the number of network parameters.  Results and Discussions  Experimental results on both LACT reconstruction and LAMAR tasks show that the proposed method outperforms existing CT reconstruction algorithms in both qualitative and quantitative evaluations. Quantitative comparisons (Table 1) indicate that the proposed method achieves higher average Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and lower Root Mean Square Error (RMSE) for both tasks across three angular ranges. Specifically, average PSNR improvements for the LAMAR and LACT tasks reach 2.78 dB, 2.88 dB, and 2.32 dB, respectively, compared with the best-performing baseline methods. 
Qualitative comparisons (Fig. 4 and Fig. 5) show that reconstructing CT images and projection data through alternating iterations, combined with dual-domain information interaction, enables the network to effectively suppress composite artifacts and improve the reconstruction of soft tissue regions and fine structural details. These results consistently exceed those of existing approaches. Visual assessment of reconstruction performance on the clinical dataset for the LAMAR task (Fig. 6) further demonstrates the method’s effectiveness in reducing metal artifacts around implants. The reconstructed images exhibit clearer structural boundaries and improved tissue visibility, indicating strong generalization to previously unseen clinical data.  Conclusions  To address the combined task of LACT reconstruction and LAMAR, this study proposes a dual-domain, model- and data-driven reconstruction framework. The optimization problem is solved using an alternating iteration strategy and unfolded into a model-driven CT reconstruction network, with each subnetwork trained in a data-driven manner. In the projection-domain network, a TS module identifies the presence of metallic implants in the initial CT estimates, allowing a single model to simultaneously handle cases with and without metal. A trace-aware projection-domain proximal subnetwork, integrating Transformer and convolutional neural network architectures, is designed to capture both local and non-local contextual features for restoring metal-corrupted regions. In the image-domain network, a U-Net architecture enhanced with channel and spatial attention mechanisms is used to maximize spatial feature utilization and improve reconstruction quality. Experimental results on the AAPM and DeepLesion datasets confirm that the proposed method consistently outperforms existing algorithms under various limited-angle conditions and in the presence of metal artifacts. 
Further evaluation on the SpineWeb dataset demonstrates the network’s generalization capability across clinical scenarios.
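The alternating proximal-gradient scheme that the network unrolls can be sketched generically. This is a sketch under stated assumptions: the data-fidelity terms are taken as simple quadratics, and the learned proximal subnetworks are replaced by plain callables (identity maps in the demo), which is not the paper's trained architecture.

```python
import numpy as np

def unrolled_stage(x, s, A, y, eta_x, eta_s, prox_x, prox_s):
    """One stage of the alternating scheme: a gradient step on each
    domain's data-fidelity term followed by a proximal mapping. In the
    paper the proximal maps are learned subnetworks; here they are
    arbitrary callables."""
    # projection domain: pull the sinogram estimate s toward measurements y
    s = prox_s(s - eta_s * (s - y))
    # image domain: gradient of 0.5 * ||A x - s||^2 w.r.t. x is A^T (A x - s)
    x = prox_x(x - eta_x * (A.T @ (A @ x - s)))
    return x, s

# demo with A = identity and identity proximal maps: the alternating
# iterations drive both domain variables toward the measurements
A = np.eye(2)
y = np.array([1.0, 2.0])
x, s = np.zeros(2), np.zeros(2)
identity = lambda v: v
for _ in range(60):
    x, s = unrolled_stage(x, s, A, y, 0.5, 0.5, identity, identity)
```

Unrolling fixes the number of such stages and makes the step sizes (here the TS module's role) and proximal maps trainable end to end.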
A Medical Video Segmentation Algorithm Integrating Neighborhood Attention and State Space Model
DING Jianrui, ZHANG Ting, LIU Jiadong, NING Chunping
2025, 47(5): 1582-1595. doi: 10.11999/JEIT240755
Abstract:
  Objective  Accurate segmentation of lesions in medical videos is crucial for clinical diagnosis and treatment. Unlike static medical images, videos provide continuous temporal information, enabling tracking of lesion evolution and morphological changes. However, existing segmentation methods primarily focus on processing individual frames, failing to effectively capture temporal correlations across frames. While self-attention mechanisms have been used to model long-range dependencies, their quadratic computational complexity renders them inefficient for high-resolution video segmentation. Additionally, medical videos are often affected by motion blur, noise, and illumination variations, which further hinder segmentation accuracy. To address these challenges, this paper proposes a novel medical video segmentation algorithm that integrates neighborhood attention and a State Space Model (SSM). The approach aims to efficiently capture both local and global spatiotemporal features, improving segmentation accuracy while maintaining computational efficiency.  Methods  The proposed approach comprises two key stages: local feature extraction and global temporal modeling, designed to efficiently capture both spatial and temporal dependencies in medical video segmentation. In the first stage, a deep convolutional network is used to extract spatial features from each video frame, providing a detailed representation of anatomical structures. However, relying solely on spatial features is insufficient for medical video segmentation, as lesions often undergo subtle morphological changes over time. To address this, a neighborhood attention mechanism is introduced to capture short-term dependencies between adjacent frames. Unlike conventional self-attention mechanisms, which compute relationships across the entire frame, neighborhood attention selectively attends to local regions around each pixel, reducing computational complexity while preserving essential temporal coherence. 
This localized attention mechanism enables the model to focus on small but critical changes in lesion appearance, making it more robust to motion and deformation variations. In the second stage, an SSM module is integrated to capture long-range dependencies across the video sequence. Unlike Transformer-based approaches, which suffer from quadratic complexity due to the self-attention mechanism, the SSM operates with linear complexity, significantly improving computational efficiency while maintaining strong temporal modeling capabilities. To further enhance the processing of video-based medical data, a 2D selective scanning mechanism is introduced to extend the SSM from 1D to 2D. This mechanism enables the model to extract spatiotemporal relationships more effectively by scanning input data along multiple directions and merging the results, ensuring that both local and global temporal structures are well represented. The combination of neighborhood attention for local refinement and SSM-based modeling for long-range dependencies enables the proposed method to achieve a balance between segmentation accuracy and computational efficiency. The model is trained and evaluated on multiple medical video datasets to verify its effectiveness across different segmentation scenarios, demonstrating its capability to handle complex lesion appearances, background noise, and variations in imaging conditions.  Results and Discussions  The proposed method is evaluated on three widely used medical video datasets: thyroid ultrasound, CVC-ClinicDB, and CVC-ColonDB. The model achieves Intersection Over Union (IOU) scores of 72.7%, 82.3%, and 72.5%, respectively, outperforming existing state-of-the-art methods. Compared to the Vivim model, the proposed method improves IOU by 5.7%, 1.7%, and 5.5%, highlighting the advantage of leveraging temporal information. 
In terms of computational efficiency, the model achieves 23.97 frames per second (fps) on the thyroid ultrasound dataset, making it suitable for real-time clinical applications. A comparative analysis against several state-of-the-art methods, including UNet, TransUNet, PraNet, U-Mamba, LKM-UNET, RMFG, SALI, and Vivim, demonstrates that the proposed method consistently outperforms these approaches, particularly in complex scenarios with significant background noise, occlusions, and motion artifacts. Specifically, on the CVC-ClinicDB dataset, the proposed model achieves an IOU of 82.3%, exceeding the previous best approach (80.9%). On the CVC-ColonDB dataset, which presents additional challenges due to lighting variations and occlusions, the model attains an IOU of 72.5%, outperforming the previous best method (70.8%). These results highlight the importance of incorporating both local and global temporal information to enhance segmentation accuracy and robustness in medical video analysis.  Conclusions  This study proposes a medical video segmentation algorithm that integrates neighborhood attention and an SSM to capture both local and global spatiotemporal features. This integration enables an effective balance between segmentation accuracy and computational efficiency. Experimental results demonstrate the superiority of the proposed method over existing approaches across multiple medical video datasets. The main contributions include: the combined use of neighborhood attention and SSM for efficient spatiotemporal feature extraction; a 2D selective scanning mechanism that extends SSMs for video-based medical segmentation; improved segmentation performance exceeding that of state-of-the-art models while maintaining real-time processing capability; and enhanced robustness to background noise and lighting variations, improving reliability in clinical applications. 
Future work will focus on incorporating prior knowledge and anatomical constraints to refine segmentation accuracy in cases with ambiguous lesion boundaries; developing advanced boundary refinement strategies for challenging scenarios; extending the framework to multi-modal imaging data such as CT and MRI videos; and optimizing the model for deployment on edge devices to support real-time processing in point-of-care and mobile healthcare settings.
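The complexity argument for neighborhood attention can be made concrete with a toy 1D version: each position attends only to a window of +/- `radius` neighbors, so cost grows linearly with sequence length rather than quadratically. This is an illustrative sketch, not the paper's implementation (which operates on 2D spatiotemporal neighborhoods and learned query/key/value projections, omitted here).

```python
import numpy as np

def neighborhood_attention(x, radius=1):
    """Toy 1D neighborhood attention over a sequence x of shape (n, d):
    scaled dot-product attention restricted to a local window."""
    n, d = x.shape
    out = np.zeros_like(x)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        keys = x[lo:hi]                       # local window only
        scores = keys @ x[i] / np.sqrt(d)     # scaled dot-product
        w = np.exp(scores - scores.max())     # stable softmax
        w /= w.sum()
        out[i] = w @ keys
    return out
```

With `radius=0` each position attends only to itself and the input passes through unchanged; growing the radius trades computation for a wider temporal context, which is the knob the method tunes against full self-attention.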
Left Atrial Scar Segmentation Method Combining Cross-Modal Feature Excitation and Dual Branch Cross Attention Fusion
RUAN Dongsheng, SHI Zhebin, WANG Jiahui, LI Yang, JIANG Mingfeng
2025, 47(5): 1596-1608. doi: 10.11999/JEIT240775
Abstract:
  Objective  Atrial Fibrillation (AF) is a common arrhythmia associated with increased mortality. The distribution and extent of left atrial fibrosis are critical for predicting the onset and persistence of AF, as fibrotic tissue alters cardiac electrical conduction. Accurate segmentation of left atrial scars is essential for identifying fibrotic lesions and informing clinical diagnosis and treatment. However, this task remains challenging due to the irregular morphology, sparse distribution, and small size of scars. Deep learning models often perform poorly in scar feature extraction owing to limited supervision of atrial boundary information, which results in detail loss and reduced segmentation accuracy. Although increasing dataset size can improve performance, medical image acquisition is costly. To address this, the present study integrates prior knowledge that scars are generally located on the atrial wall to enhance feature extraction while reducing reliance on large labeled datasets. Two boundary feature enhancement modules are proposed. The Cross-Modal feature Excitation (CME) module encodes atrial boundary features to guide the network’s attention to atrial structures. The Dual-Branch Cross-Attention (DBCA) fusion module combines Magnetic Resonance Imaging (MRI) and boundary features at a deeper level to enhance boundary scar representation and improve segmentation accuracy.  Methods  This study proposes an enhanced U-shaped encoder–decoder framework for left atrial scar segmentation, incorporating two modules: the CME module and the DBCA module. These modules are embedded within the encoder to strengthen attention on atrial boundary features and improve segmentation accuracy. First, left atrial cavity segmentation is performed on cardiac MRI using a pre-trained model to obtain a binary mask. This binary map undergoes dilation and erosion to generate a Signed Distance Map (SDM), which is then used together with the MRI as input to the model. 
The SDM serves as an auxiliary representation that introduces boundary constraints. The CME module, integrated within the encoder’s convolutional blocks, applies channel and spatial attention mechanisms to both MRI and SDM features, thereby enhancing boundary information and guiding attention to scar regions. To further reinforce boundary features at the semantic level, the DBCA module is positioned at the bottleneck layer. This module employs a two-branch cross-attention mechanism to facilitate deep interaction and fusion of MRI and boundary features. The bidirectional cross-attention enables SDM and MRI features to exchange cross-modal information, reducing feature heterogeneity and generating semantically enriched and robust boundary fusion features. A combined Dice and cross-entropy loss function is used during training to improve segmentation precision and scar region identification.  Results and Discussions  This study uses a dataset of 60 left atrial scar segmentations from the LAScarQS 2022 Task 1. The dataset is randomly divided into 48 training and 12 test cases. Several medical image segmentation models, including U-Net, nnUNet, and TransUNet, are evaluated. Results show that three-dimensional segmentation consistently outperforms two-dimensional approaches. The proposed method exceeds the baseline nnUNet, with a 4.14% improvement in Dice score and a 6.37% increase in accuracy (Table 1). Visual assessments confirm improved sensitivity to small scar regions and enhanced attention to boundaries (Fig. 6, Fig. 7). To assess model performance, comparative and ablation experiments are conducted. These include evaluations of encoder configurations (shared vs independent), feature fusion strategies (CME, DBCA, and CBAM), and fusion weight parameters α and β. An independent encoder incorporating both CME and DBCA modules achieves the highest performance (Table 3), with the optimal weight configuration at α = 0.7 and β = 0.3 (Table 5). 
The effect of different left atrial border widths (2.5 mm, 5.0 mm, and 7.5 mm) is also analyzed. A 5.0 mm width provides the best segmentation results, whereas 7.5 mm may extend beyond the relevant region and reduce accuracy (Table 6).  Conclusions  This study integrates the proposed CME and DBCA modules into the nnUNet framework to address detail loss and feature extraction limitations in left atrial scar segmentation. The findings indicate that: (1) The CME module enhances MRI feature representation by incorporating left atrial boundary information across spatial and channel dimensions, improving the model’s focus on scar regions; (2) The DBCA module enables effective learning and fusion of boundary and MRI features, further improving segmentation accuracy; (3) The proposed model outperforms existing medical image segmentation methods on the LAScarQS 2022 dataset, achieving a 4.14% increase in Dice score and a 6.37% gain in accuracy compared to the baseline nnUNet. Despite these improvements, current deep learning models remain limited in their sensitivity to small and poorly defined scars, which often results in segmentation omissions. Challenges persist due to the limited dataset size and the relatively small proportion of scar tissue within each image. These factors constrain the training process and model generalizability. Future work should focus on optimizing scar segmentation under small-sample conditions and addressing sample imbalance to improve overall performance.
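The Signed Distance Map input that encodes the atrial boundary prior can be illustrated with a small brute-force computation: negative distances inside the cavity, positive outside, zero magnitude growing away from the boundary. This sketch uses a direct nearest-boundary-pixel search for clarity; the paper instead derives its SDM from dilation and erosion of the binary mask, so treat this as an assumed equivalent for illustration only.

```python
import math
import numpy as np

def signed_distance_map(mask):
    """Signed Euclidean distance to the boundary of a binary mask:
    negative inside the region, positive outside. O(n^2) brute force,
    fine for a small demonstration."""
    inside = {(int(r), int(c)) for r, c in zip(*np.nonzero(mask))}
    # boundary pixels: inside pixels with at least one outside 4-neighbour
    boundary = [(y, x) for (y, x) in inside
                if any((y + dy, x + dx) not in inside
                       for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))]
    sdm = np.zeros(mask.shape)
    for y in range(mask.shape[0]):
        for x in range(mask.shape[1]):
            d = min(math.hypot(y - by, x - bx) for by, bx in boundary)
            sdm[y, x] = -d if (y, x) in inside else d
    return sdm
```

Feeding this map alongside the MRI gives the CME and DBCA modules an explicit, smoothly varying encoding of where the atrial wall lies, which is the boundary constraint the abstract describes.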