A Polymorphic Network Backend Compiler for Domestic Switching Chips
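Summary: Existing backend compilers are designed and optimized mainly for programmable devices such as the BMv2 software programmable switch, Field Programmable Gate Arrays (FPGAs), and Intel Tofino series chips, and are not suitable for compiling polymorphic network programs for the domestically produced Centec TsingMa.MX switching chip. To address this, this paper proposes p4c-TsingMa, a polymorphic network backend compiler for the TsingMa.MX switching chip. It compiles a high-level network programming language to the TsingMa.MX switching chip, so that the chip can parse and forward packets of multiple network modalities at the same time. p4c-TsingMa first uses preorder traversal to extract key information such as protocol types, protocol fields, and actions from the intermediate representation, then translates instructions based on the extracted information, and finally generates TsingMa.MX chip control commands. In addition, p4c-TsingMa uses a User Defined Field (UDF) merging method that combines the match instructions of different network modalities into one lookup table, so that the match keys of multiple modalities are extracted in a single pass, which improves chip resource utilization. Experimental results show that p4c-TsingMa correctly compiles programs for multiple network modalities and, compared with a scenario in which the UDF entry merging algorithm is disabled and each modality's UDF rules are configured independently on a single port, raises register resource utilization by 37.5% to 75%.

Abstract: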
Objective: The P4 language and programmable switching chips offer a feasible approach for deploying polymorphic networks. However, polymorphic network programs written in P4 cannot be executed directly on the domestically produced TsingMa.MX programmable switching chip developed by Centec, so a dedicated compiler is needed to translate P4 programs and deploy them on this chip. Existing backend compilers are designed and optimized mainly for the BMv2 software switch, FPGAs, and Intel Tofino series chips, which makes them unsuitable for compiling polymorphic network programs for the TsingMa.MX chip. To resolve this limitation, a backend compiler named p4c-TsingMa is proposed for the TsingMa.MX switching chip. The compiler translates high-level network programming languages into commands executable by the TsingMa.MX chip, thereby supporting concurrent parsing and forwarding of packets from multiple network modalities.

Methods: p4c-TsingMa first uses preorder traversal to extract key information, including protocol types, protocol fields, and actions, from the Intermediate Representation (IR). It then performs instruction translation to generate the corresponding control commands for the TsingMa.MX chip. In addition, p4c-TsingMa adopts a User Defined Field (UDF) entry merging method that consolidates the matching instructions of different network modalities into a single lookup table. This design extracts the match keys of multiple modalities in a single operation, thereby improving chip resource utilization.

Results and Discussions: The p4c-TsingMa compiler is implemented in C++ and maps network modal programs written in P4 to configuration instructions for the TsingMa.MX switching chip. A polymorphic network packet testing environment (Fig. 7) is established, in which multiple types of network packets are sent simultaneously to the same switch port. According to the configured flow tables, the chip correctly identifies the polymorphic network packets and forwards them to their corresponding ports (Fig. 9). In addition, the table entry merging algorithm improves register resource utilization by 37.5% to 75%, enabling the chip to process more than two types of modal packets concurrently.

Conclusions: A polymorphic network backend compiler, p4c-TsingMa, is designed specifically for domestic switching chips. Using the FlexParser and FlexEdit functions of the TsingMa.MX chip, the compiler translates polymorphic network programs into executable commands for the chip, enabling it to parse and modify polymorphic data packets. Experimental results demonstrate that p4c-TsingMa achieves high compilation efficiency and improves register resource utilization by 37.5% to 75%.
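The extract-then-translate flow described under Methods can be illustrated with a small Python sketch. It is not the C++ implementation of p4c-TsingMa: the `Field` class and the `field_offsets` and `udf_command` helpers are hypothetical names introduced here, the header layouts are the standard Ethernet II and option-free IPv4 formats, and the command shape is copied from the two match rules listed in Table 2. The sketch accumulates bit offsets field by field and formats one UDF match command per key.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    width: int  # bit width of the field


# Standard Ethernet II and IPv4 (no options) layouts; the field names are illustrative.
ETHERNET = [Field("dst_mac", 48), Field("src_mac", 48), Field("etherType", 16)]
IPV4 = [
    Field("version", 4), Field("ihl", 4), Field("diffserv", 8),
    Field("total_len", 16), Field("identification", 16),
    Field("flags", 3), Field("frag_offset", 13),
    Field("ttl", 8), Field("protocol", 8), Field("hdr_checksum", 16),
    Field("src_addr", 32), Field("dst_addr", 32),
]


def field_offsets(stack):
    """Map each field name to (bit offset from the start of the L2 frame, bit width)."""
    offsets, offset = {}, 0
    for header in stack:
        for field in header:
            offsets[field.name] = (offset, field.width)
            offset += field.width
    return offsets


def udf_command(rule_id, offsets, field, value):
    """Format a match rule as 'config udf_N L2 <offset> <value> <mask>' (shape assumed from Table 2)."""
    offset, width = offsets[field]
    digits = width // 4            # hex digits needed for the key
    mask = (1 << width) - 1        # exact match on every bit of the field
    return f"config udf_{rule_id} L2 {offset} 0X{value:0{digits}X} 0X{mask:0{digits}X}"


offsets = field_offsets([ETHERNET, IPV4])
print(udf_command(1, offsets, "etherType", 0x0800))
print(udf_command(2, offsets, "dst_addr", 0x0A000003))
```

Under these assumptions the two printed commands carry offsets 96 and 240, the same values that appear in Table 2: etherType follows two 48-bit MAC addresses, and dst_addr lies 128 bits into the IPv4 header, which starts after the 112-bit Ethernet header.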
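Table 1 Comparison of compilation methods

| Target platform | Compilation method | IR parsing strategy | Hardware mapping logic | Core optimization direction |
|---|---|---|---|---|
| BMv2 | P4LLVM [16] | Parses P4 code with the LLVM framework and extracts the network program logic that can be processed by LLVM optimization passes | Generates JSON output and uses the general-purpose optimizations of LLVM to improve network program efficiency | Provides dedicated backend compilation support for the BMv2 software switch |
| FPGA | P4FPGA [17] | Parses P4 code and extracts parallel, high-performance logic suited to FPGA hardware implementation | Generates Verilog code, simplifying FPGA development and combining the flexibility of P4 with the high performance of FPGAs | Uses LLVM optimization passes to improve network program execution efficiency |
| Tofino | Chipmunk [18] | Parses P4 code and compiles it in multiple stages, abstracting away Tofino-specific hardware details | Generates Tofino-specific files, provides a high-level programming abstraction, and fits the match-action pipeline architecture of Tofino | Simplifies FPGA development and achieves efficient integration of P4 and FPGA |
| TsingMa | p4c-TsingMa (this paper) | Parses the DAG-structured intermediate representation by preorder traversal and precisely extracts key information such as protocol types | Defines custom mapping rules from P4 semantics to the FlexParser/FlexEdit hardware parameters of the domestic chip and generates matching chip instructions, solving hardware adaptation for a chip without native P4 support | Provides a high-level programming abstraction that hides Tofino hardware details |

Algorithm 1 Protocol header field offset calculation algorithm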
Input: $\mathrm{hdr}$, a P4 protocol header type definition (Type_Header object)
Output: updates the global map $\mathrm{structOffsets}$, which stores the metadata of the protocol header fields
(1) Initialize $\mathrm{structInfo} \leftarrow \varnothing$
(2) $\mathrm{headerName} \leftarrow$ name of $\mathrm{hdr}$
(3) Create an empty $\mathrm{headerOffset}$ dictionary
(4) $\mathrm{offset} \leftarrow 0$
(5) for each field $\mathrm{ele}$ in the field list of $\mathrm{hdr}$ do
(6)   $\mathrm{structInfo.eleName} \leftarrow \mathrm{ele.name}$
(7)   if the type of $\mathrm{ele}$ is $\mathrm{Type\_Bits}$ then
(8)     $\mathrm{structInfo.eleType}, \mathrm{length} \leftarrow$ "Type_Bits", bit width of $\mathrm{ele}$
(9)     $\mathrm{offset} \leftarrow \mathrm{offset} +$ bit width of $\mathrm{ele}$
(10)   end if
(11)   Insert (name of $\mathrm{ele}$, $\mathrm{structInfo}$) into the $\mathrm{headerOffset}$ dictionary
(12) end for
(13) $\mathrm{structInfo.eleName}, \mathrm{length}, \mathrm{eleOffset} \leftarrow$ "totalOffset", $\mathrm{offset}$, $\mathrm{offset}$
(14) Insert ("totalOffset", $\mathrm{structInfo}$) into the $\mathrm{headerOffset}$ dictionary
(15) Insert ($\mathrm{headerName}$, $\mathrm{headerOffset}$) into the global $\mathrm{structOffsets}$ dictionary

Algorithm 2 UDF entry merging algorithm
(1) Sort the UDF rules in ruleSet in descending order of key length
(2) Set up mergeNum modal groups, each initialized to $\varnothing$
(3) for each UDF rule in ruleSet:
(4)   Find the modal group that consumes the least storage resources, denoted $\mathrm{G\_min}$
(5)   $\mathrm{G\_min} \leftarrow \mathrm{G\_min} \cup \{\text{UDF rule}\}$
(6) end for
(7) for each modal group:
(8)   Build the merged UDF rule mergeRule
(9)   $\mathrm{mergeRuleSet} \leftarrow \mathrm{mergeRuleSet} \cup \mathrm{mergeRule}$
(10) end for
(11) return mergeRuleSet

Table 2 CLI instructions for the IPv4 modality
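| ID | CLI | Description |
|---|---|---|
| 1 | config udf_1 L2 96 0X0800 0XFFFF | etherType is matched by rule udf_1: base offset type L2, offset 96 bits, mask 0XFFFF, match value 0X0800 |
| 2 | config udf_2 L2 240 0X0A000003 0XFFFFFFFF | dst_addr is matched by rule udf_2: base offset type L2, offset 240 bits, mask 0XFFFFFFFF, match value 0X0A000003 |
| 3 | config udf match udf_1 udf_2 action Ethernet2 | Forwards the matched packets out of port Ethernet2 |

Table 3 Chip resource usage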
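| Network modality | Number of keys | Key length (bit) | Register resource utilization (%) |
|---|---|---|---|
| AccMF | 3 | 80 | 62.5 |
| CoreMF | 2 | 32 | 25 |
| IPv4 | 2 | 48 | 37.5 |
| PwL | 2 | 64 | 50 |
| IPv6 | 2 | 144 | 100 |
| AccMF_CoreMF_IPv4_PwL | 6 | 176 | 100 |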
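The grouping loop of Algorithm 2 and the utilization figures of Table 3 can be related with a short Python sketch. Several things here are assumptions rather than statements from the paper: the 128-bit register capacity is inferred from the table rows (80 bits corresponds to 62.5%), utilization is modelled as key length over capacity capped at 100%, and `group_rules` and `utilization` are hypothetical names. Because the individual keys of each modality are not listed, the sketch balances groups on summed key lengths only and takes the 176-bit merged key length directly from Table 3.

```python
REGISTER_BITS = 128  # assumption: inferred from Table 3, where 80 bits maps to 62.5 %

# Key length in bits per modality, copied from Table 3.
KEY_BITS = {"AccMF": 80, "CoreMF": 32, "IPv4": 48, "PwL": 64}
MERGED_BITS = 176    # AccMF_CoreMF_IPv4_PwL row of Table 3


def utilization(key_bits):
    """Register utilization in percent, capped at 100 (assumed model)."""
    return min(100.0, 100.0 * key_bits / REGISTER_BITS)


def group_rules(key_bits, merge_num):
    """Steps (1)-(6) of Algorithm 2: longest key first, always into the lightest group."""
    groups = [[] for _ in range(merge_num)]
    loads = [0] * merge_num  # summed key length per group, ignoring shared keys
    for name in sorted(key_bits, key=key_bits.get, reverse=True):
        lightest = loads.index(min(loads))
        groups[lightest].append(name)
        loads[lightest] += key_bits[name]
    return groups, loads


# With a single group, all four modalities end up in one merged rule.
groups, loads = group_rules(KEY_BITS, merge_num=1)

# Gain per modality: merged utilization minus standalone utilization.
gains = {m: utilization(MERGED_BITS) - utilization(b) for m, b in KEY_BITS.items()}
print(groups, loads)
print(min(gains.values()), max(gains.values()))
```

Under this model the gains span 37.5 (AccMF, from 62.5% to 100%) to 75 (CoreMF, from 25% to 100%), which matches the 37.5% to 75% range quoted in the abstract. The summed load of 224 bits is larger than the 176 bits that Table 3 reports for the merged rule. One reading consistent with both the key count and the key length is that a 16-bit key shared by the four modalities is stored once, but the source does not state this.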
[1] TU Huaqing, LIAO Junhu, ZHU Jun, et al. Network modal coexistence and optimal deployment method in polymorphic network environment[J]. Acta Electronica Sinica, 2025, 53(5): 1650–1660. doi: 10.12263/DZXB.20250015. (in Chinese)
[2] WU Jiangxing, LI Junfei, SUN Penghao, et al. Theoretical framework for a polymorphic network environment[J]. Engineering, 2024, 39: 222–234. doi: 10.1016/j.eng.2024.01.018.
[3] WU Jiangxing and HU Yuxiang. The development paradigm of separation between network technical system and supporting environment[J]. Information and Communications Technology and Policy, 2021, 47(8): 1–11. doi: 10.12267/j.issn.2096-5931.2021.08.001. (in Chinese)
[4] HU Yuxiang, LI Dan, SUN Penghao, et al. Polymorphic smart network: An open, flexible and universal architecture for future heterogeneous networks[J]. IEEE Transactions on Network Science and Engineering, 2020, 7(4): 2515–2525. doi: 10.1109/tnse.2020.3006249.
[5] BOSSHART P, DALY D, GIBB G, et al. P4: Programming protocol-independent packet processors[J]. ACM SIGCOMM Computer Communication Review, 2014, 44(3): 87–95. doi: 10.1145/2656877.2656890.
[6] Tofino[EB/OL]. https://www.barefootnetworks.com/products/brief-tofino, 2025.
[7] Centec. CTC8180[EB/OL]. https://www.centec.com/silicon/26, 2025.
[8] HAUSER F, HÄBERLE M, MERLING D, et al. A survey on data plane programming with P4: Fundamentals, advances, and applied research[J]. Journal of Network and Computer Applications, 2023, 212: 103561. doi: 10.1016/j.jnca.2022.103561.
[9] YANG Yifan, HE Lin, ZHOU Jiasheng, et al. P4runpro: Enabling runtime programmability for RMT programmable switches[C]. The ACM SIGCOMM 2024 Conference, Sydney, Australia, 2024: 921–937. doi: 10.1145/3651890.3672230.
[10] CHEN Zhikang, FENG Yong, LIU Shuxin, et al. OptimusPrime: Unleash dataplane programmability through a transformable architecture[C]. The ACM SIGCOMM 2024 Conference, Sydney, Australia, 2024: 904–920. doi: 10.1145/3651890.3672214.
[11] WANG Tao, YANG Xiangrui, ANTICHI G, et al. Isolation mechanisms for high-speed packet-processing pipelines[C]. The 19th USENIX Symposium on Networked Systems Design and Implementation, Renton, USA, 2022: 1289–1305.
[12] LU Chenyunfei, TANG Zhu, PENG Wei, et al. Running P4 programs on general programmable network interconnection chips[C]. 2023 Fourth International Conference on Frontiers of Computers and Communication Engineering, Xiamen, China, 2023: 1–6. doi: 10.1109/fcce58525.2023.00008.
[13] DA ROBIN D and KHAN J I. An open-source P4$_{16}$ compiler backend for reconfigurable match-action table switches: Making networking innovation accessible[J]. Computer Networks, 2024, 242: 110246. doi: 10.1016/j.comnet.2024.110246.
[14] GAO Jiaqi, ZHAI Ennan, LIU H H, et al. Lyra: A cross-platform language and compiler for data plane programming on heterogeneous ASICs[C]. The Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, 2020: 435–450. doi: 10.1145/3387514.3405879.
[15] LI Yifan, GAO Jiaqi, ZHAI Ennan, et al. Cetus: Releasing P4 programmers from the chore of trial and error compiling[C]. The 19th USENIX Symposium on Networked Systems Design and Implementation, Renton, USA, 2022: 371–385.
[16] DANGETI T K, KEERTHY S V, and UPADRASTA R. P4LLVM: An LLVM based P4 compiler[C]. The 26th International Conference on Network Protocols, Cambridge, UK, 2018: 424–429. doi: 10.1109/icnp.2018.00059.
[17] WANG Han, SOULÉ R, DANG H T, et al. P4FPGA: A rapid prototyping framework for P4[C]. The Symposium on SDN Research, Santa Clara, USA, 2017: 122–135. doi: 10.1145/3050220.3050234.
[18] GAO Xiangyu, KIM T, WONG M D, et al. Switch code generation using program synthesis[C]. The Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, 2020: 44–61. doi: 10.1145/3387514.3405852.
[19] p4c, a reference compiler for P4 programming language[EB/OL]. https://github.com/p4lang/p4c, 2025.
[20] VENKATAKEERTHY S, AGGARWAL R, JAIN S, et al. IR2VEC: LLVM IR based scalable program embeddings[J]. ACM Transactions on Architecture and Code Optimization (TACO), 2020, 17(4): 32. doi: 10.1145/3418463.
[21] HARKOUS H, PAPAGIANNI C, DE SCHEPPER K, et al. Virtual queues for P4: A poor man’s programmable traffic manager[J]. IEEE Transactions on Network and Service Management, 2021, 18(3): 2860–2872. doi: 10.1109/tnsm.2021.3077051.
[22] VENKATAKEERTHY S, ANDALURI Y, DEY S, et al. Packet processing algorithm identification using program embeddings[C]. The 6th Asia-Pacific Workshop on Networking, Fuzhou, China, 2022: 76–82. doi: 10.1145/3542637.3542649.