Multi-scale Frequency Adapter and Dual-path Attention for Time Series Forecasting
-
Abstract: Existing mainstream time series forecasting methods struggle to jointly handle the complex periodic patterns and local dynamic variations in data when performing multi-scale modeling and frequency-domain feature extraction, and therefore fail to fully capture key temporal characteristics. To address this problem, a time series forecasting method based on a Multi-scale Frequency Adapter and Dual-path Attention (MFADA) is proposed. The method employs a Multi-scale Frequency Adapter (MFA) to adaptively extract the key frequency components of the series and obtain a global periodicity prior. In addition, a Multi-scale Dual-path Attention (MDA) mechanism embeds this frequency-domain prior into both a temporal path and a feature path, enabling dynamic cross-granularity collaborative modeling and strengthening the characterization of complex temporal evolution. Experimental results show that MFADA significantly outperforms existing mainstream forecasting methods on 8 public time series datasets in both prediction accuracy and computational efficiency, validating the effectiveness and superiority of the proposed "frequency-domain guidance, time-domain collaboration" framework and offering a new approach to complex time series tasks.
Objective: With the rapid development of big data technology, time series data is increasingly used in areas such as meteorology, power systems, and finance. Nonetheless, mainstream time series forecasting methods face notable challenges in multi-scale modeling and frequency-domain feature extraction, which prevents them from comprehensively capturing the crucial dynamic properties and periodic patterns of complex datasets. Traditional statistical approaches such as ARIMA rely on assumptions of linear relationships and therefore perform poorly on nonlinear or high-dimensional time series. Although deep learning methods, notably those based on convolutional neural networks and Transformers, have improved forecasting accuracy through advanced feature extraction and long-range dependency modeling, they remain limited in their ability to efficiently extract and fuse multi-scale features in both the temporal and frequency domains. These deficiencies lead to instability and suboptimal accuracy, particularly in dynamic, highly variable applications. This paper addresses these challenges by proposing an intelligent forecasting framework that effectively models multi-scale information and improves prediction accuracy across diverse scenarios.
Methods: The proposed method introduces a Multi-scale Frequency Adapter and Dual-path Attention (MFADA) framework for time series forecasting. The framework integrates two key modules: the Multi-scale Frequency Adapter (MFA) and the Multi-scale Dual-path Attention (MDA). The MFA module efficiently captures multi-scale frequency features using adaptive pooling and deep convolutions, which enhances sensitivity to different frequency components and supports the modeling of both short-term and long-term dependencies.
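The two ingredients of the MFA described above, selecting dominant frequency components and pooling the series at several temporal scales, can be illustrated with a minimal plain-Python sketch. This is an assumption-laden toy, not the paper's implementation: MFA learns its frequency weighting and fuses branches with convolutions and channel gating, whereas this sketch uses a naive DFT and fixed average pooling.

```python
import cmath
import math

def top_k_frequencies(x, k=2):
    """Naive O(n^2) DFT over bins 1..n/2; return the k bins with the
    largest magnitude, as a crude global-periodicity prior.
    (Illustrative only: the MFA module learns this selection.)"""
    n = len(x)
    mags = []
    for f in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        mags.append((abs(coeff), f))
    mags.sort(reverse=True)
    return [f for _, f in mags[:k]]

def pyramid_features(x, scales=(1, 2, 4)):
    """Multi-scale average pooling: for each scale s, average
    non-overlapping windows of length s, then repeat each average s
    times so every branch returns to the input length."""
    assert all(len(x) % s == 0 for s in scales)
    out = []
    for s in scales:
        pooled = [sum(x[i:i + s]) / s for i in range(0, len(x), s)]
        out.append([v for v in pooled for _ in range(s)])
    return out

# A series with strong periods n/4 and n/8 -> dominant DFT bins 4 and 8.
n = 64
series = [math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 8 * t / n)
          for t in range(n)]
print(top_k_frequencies(series, k=2))   # -> [4, 8]
# Scale 2 smooths adjacent pairs: [1, 3, 2, 4] -> [2.0, 2.0, 3.0, 3.0]
print(pyramid_features([1.0, 3.0, 2.0, 4.0], scales=(1, 2))[1])
```

In the actual module, the pooled branches would feed learned convolutions and a gating step that reweights frequency components rather than hard-selecting the top k.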
The MDA module applies a multi-scale attention mechanism to strengthen fine-grained modeling along both the temporal and feature dimensions, enabling effective extraction and fusion of comprehensive time- and frequency-domain information. The entire framework is designed with computational efficiency in mind to ensure scalability. Experimental validation on 8 public datasets demonstrates superior performance and robustness compared with existing mainstream time series forecasting approaches.
Results and Discussions: Extensive experiments were conducted on 8 publicly available multivariate datasets: ECL, Weather, ETT (ETTm1, ETTm2, ETTh1, ETTh2), Solar-Energy, and Traffic. The evaluation metrics were mean absolute error (MAE) and mean squared error (MSE), with parameter count, FLOPs, and training time additionally considered for computational efficiency. Comparisons with state-of-the-art models, including Fredformer, Peri-midFormer, iTransformer, TFformer, PatchTST, MSGNet, TimesNet, and TCM, show that the proposed MFADA consistently achieves superior forecasting performance across most datasets and forecasting horizons (Table 1), with the best average MSE and MAE of 0.163 and 0.261 on ECL, and reductions of 13.2% and 17.3% relative to TimesNet at forecasting length 96. On the periodic ETTm1 dataset, the average MSE reaches 0.377, outperforming MSGNet by 5.3%. Ablation studies (Table 2) demonstrate the importance of both the MFA and MDA modules: removing MFA, or reverting MDA to standard self-attention, increases error on ECL, Weather, ETTh1, and ETTh2, indicating their synergistic contribution to modeling complex dynamics. Complexity analysis (Fig. 2) reveals that MFADA achieves the best balance among forecasting accuracy, parameter efficiency, and training time, outperforming Fredformer, MSGNet, and TimesNet. Visualization results for ECL and ETTh2 (Fig. 3, Fig. 4) confirm MFADA's ability to track ground-truth trends and forecast turning points, outperforming baselines in both global and local prediction fidelity. Notably, MFADA's performance lags on the Traffic dataset because of its strong spatial correlations, which highlights the integration of spatial structure as a future direction.
Conclusions: This paper proposes MFADA, a novel time series forecasting method integrating multi-scale frequency adaptation and dual-path attention mechanisms. MFADA stands out with four key strengths: (1) the MFA module effectively extracts and merges multi-scale frequency-domain features, emphasizing diverse temporal scales through pyramid pooling and channel gating; (2) the MDA module captures multi-scale dependencies along both the temporal and feature dimensions, enabling fine-grained dynamic modeling; (3) the architecture maintains computational efficiency through lightweight convolution and pooling operations; (4) superior results across 8 datasets and various forecasting lengths demonstrate robust generalization, especially in multivariate and long-term forecasting scenarios. The extensive experiments confirm that MFADA advances the state of the art in accurate and efficient time series forecasting, offering promising perspectives for both academic research and practical deployment. Future work will explore the integration of spatial correlations to further broaden the model's applicability. -
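The dual-path idea, attending over time steps in one path and over variables in the other, can be sketched in plain Python. This is an illustrative toy under stated assumptions (Q = K = V with no learned projections, a fixed fusion weight `alpha`, no multi-scale branches), not the paper's actual MDA module.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.
    x is a list of n vectors of dimension d."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        out.append([sum(wj * x[j][i] for j, wj in enumerate(w)) for i in range(d)])
    return out

def transpose(x):
    return [list(col) for col in zip(*x)]

def dual_path_attention(x, alpha=0.5):
    """Temporal path: attention over the T rows of the T x C matrix.
    Feature path: attention over the C rows of the transposed matrix.
    Here the paths are blended with a fixed weight; the actual MDA
    learns the fusion and adds multi-scale branches."""
    temporal = self_attention(x)                       # T x C
    feature = transpose(self_attention(transpose(x)))  # attend over C, back to T x C
    return [[alpha * t + (1 - alpha) * f for t, f in zip(tr, fr)]
            for tr, fr in zip(temporal, feature)]

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # T = 3 time steps, C = 2 variables
y = dual_path_attention(x)
print(len(y), len(y[0]))  # 3 2
```

Because each path outputs convex combinations of the inputs, the fused result stays within the input's value range; the learned projections in the real module remove that restriction.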
Key words:
- time series forecasting
- multi-scale
- frequency adapter
- dual-path attention
- attention mechanism
-
Table 1 Comparison experiment results (each cell: MSE/MAE)
Dataset | Len | MFADA | Fredformer | Peri-midFormer | iTransformer | TFformer | PatchTST | MSGNet | TimesNet | TCM | DLinear
ECL | 96 | 0.139/0.234 | 0.147/0.241 | 0.141/0.235 | 0.148/0.239 | 0.151/0.251 | 0.181/0.270 | 0.165/0.274 | 0.168/0.272 | 0.153/0.253 | 0.197/0.282
ECL | 192 | 0.152/0.248 | 0.163/0.257 | 0.157/0.249 | 0.167/0.258 | 0.165/0.264 | 0.188/0.274 | 0.184/0.292 | 0.184/0.289 | 0.171/0.269 | 0.196/0.285
ECL | 336 | 0.166/0.266 | 0.180/0.276 | 0.174/0.267 | 0.179/0.272 | 0.180/0.278 | 0.204/0.293 | 0.195/0.302 | 0.198/0.300 | 0.183/0.283 | 0.209/0.301
ECL | 720 | 0.196/0.292 | 0.213/0.302 | 0.205/0.296 | 0.211/0.300 | 0.213/0.302 | 0.246/0.324 | 0.231/0.332 | 0.220/0.320 | 0.217/0.311 | 0.245/0.333
ECL | Avg | 0.163/0.261 | 0.176/0.269 | 0.169/0.262 | 0.176/0.267 | 0.177/0.274 | 0.205/0.290 | 0.194/0.300 | 0.192/0.295 | 0.181/0.279 | 0.212/0.300
Weather | 96 | 0.153/0.201 | 0.160/0.205 | 0.172/0.218 | 0.176/0.216 | 0.172/0.220 | 0.177/0.218 | 0.163/0.212 | 0.172/0.220 | 0.153/0.202 | 0.196/0.255
Weather | 192 | 0.205/0.248 | 0.208/0.249 | 0.217/0.256 | 0.225/0.257 | 0.219/0.259 | 0.225/0.259 | 0.212/0.254 | 0.219/0.261 | 0.203/0.249 | 0.237/0.296
Weather | 336 | 0.263/0.290 | 0.265/0.291 | 0.276/0.298 | 0.281/0.299 | 0.275/0.298 | 0.278/0.297 | 0.272/0.299 | 0.280/0.306 | 0.263/0.294 | 0.283/0.335
Weather | 720 | 0.340/0.340 | 0.343/0.341 | 0.356/0.349 | 0.358/0.350 | 0.350/0.347 | 0.354/0.348 | 0.350/0.348 | 0.365/0.359 | 0.344/0.345 | 0.345/0.381
Weather | Avg | 0.240/0.270 | 0.244/0.272 | 0.255/0.280 | 0.260/0.280 | 0.254/0.281 | 0.259/0.281 | 0.246/0.278 | 0.259/0.287 | 0.241/0.273 | 0.265/0.317
ETTm1 | 96 | 0.321/0.362 | 0.328/0.363 | 0.331/0.368 | 0.342/0.377 | 0.334/0.370 | 0.329/0.367 | 0.319/0.366 | 0.338/0.375 | 0.311/0.352 | 0.345/0.372
ETTm1 | 192 | 0.354/0.378 | 0.367/0.382 | 0.372/0.390 | 0.383/0.396 | 0.373/0.390 | 0.367/0.385 | 0.376/0.397 | 0.374/0.387 | 0.368/0.384 | 0.380/0.389
ETTm1 | 336 | 0.384/0.400 | 0.395/0.403 | 0.411/0.420 | 0.418/0.418 | 0.405/0.417 | 0.399/0.410 | 0.417/0.422 | 0.410/0.411 | 0.395/0.402 | 0.413/0.413
ETTm1 | 720 | 0.449/0.438 | 0.454/0.440 | 0.472/0.453 | 0.487/0.457 | 0.471/0.453 | 0.454/0.439 | 0.481/0.458 | 0.478/0.450 | 0.462/0.440 | 0.474/0.453
ETTm1 | Avg | 0.377/0.395 | 0.386/0.397 | 0.397/0.408 | 0.408/0.412 | 0.396/0.408 | 0.387/0.400 | 0.398/0.411 | 0.400/0.406 | 0.384/0.395 | 0.403/0.407
ETTm2 | 96 | 0.177/0.260 | 0.178/0.261 | 0.178/0.260 | 0.186/0.272 | 0.176/0.261 | 0.175/0.259 | 0.247/0.307 | 0.187/0.267 | 0.173/0.258 | 0.193/0.292
ETTm2 | 192 | 0.241/0.300 | 0.244/0.303 | 0.248/0.306 | 0.254/0.314 | 0.245/0.305 | 0.241/0.302 | 0.312/0.346 | 0.249/0.309 | 0.246/0.306 | 0.284/0.362
ETTm2 | 336 | 0.299/0.339 | 0.302/0.341 | 0.308/0.342 | 0.316/0.351 | 0.304/0.342 | 0.305/0.343 | 0.314/0.348 | 0.321/0.351 | 0.302/0.341 | 0.369/0.427
ETTm2 | 720 | 0.394/0.395 | 0.397/0.396 | 0.419/0.404 | 0.414/0.407 | 0.400/0.398 | 1.730/1.042 | 0.414/0.403 | 0.408/0.522 | 0.406/0.400 | 0.421/0.415
ETTm2 | Avg | 0.278/0.323 | 0.280/0.325 | 0.288/0.328 | 0.292/0.336 | 0.281/0.327 | 0.613/0.487 | 0.322/0.351 | 0.358/0.404 | 0.282/0.326 | 0.350/0.401
ETTh1 | 96 | 0.367/0.392 | 0.376/0.394 | 0.380/0.400 | 0.387/0.405 | 0.370/0.394 | 0.414/0.419 | 0.389/0.411 | 0.384/0.402 | 0.374/0.395 | 0.376/0.400
ETTh1 | 192 | 0.431/0.424 | 0.439/0.425 | 0.433/0.432 | 0.441/0.436 | 0.432/0.425 | 0.460/0.445 | 0.442/0.418 | 0.436/0.429 | 0.436/0.421 | 0.420/0.432
ETTh1 | 336 | 0.472/0.439 | 0.473/0.440 | 0.480/0.453 | 0.491/0.462 | 0.475/0.443 | 0.501/0.466 | 0.480/0.468 | 0.491/0.469 | 0.475/0.442 | 0.481/0.459
ETTh1 | 720 | 0.479/0.461 | 0.490/0.466 | 0.547/0.511 | 0.509/0.494 | 0.481/0.463 | 0.500/0.488 | 0.494/0.488 | 0.521/0.500 | 0.476/0.463 | 0.478/0.453
ETTh1 | Avg | 0.437/0.429 | 0.445/0.432 | 0.460/0.449 | 0.457/0.449 | 0.440/0.431 | 0.469/0.454 | 0.451/0.446 | 0.458/0.450 | 0.440/0.430 | 0.433/0.447
ETTh2 | 96 | 0.290/0.341 | 0.293/0.343 | 0.296/0.342 | 0.301/0.350 | 0.294/0.344 | 0.302/0.348 | 0.328/0.371 | 0.340/0.374 | 0.294/0.346 | 0.333/0.387
ETTh2 | 192 | 0.364/0.388 | 0.370/0.390 | 0.392/0.406 | 0.380/0.399 | 0.375/0.391 | 0.388/0.400 | 0.402/0.414 | 0.402/0.414 | 0.383/0.399 | 0.477/0.476
ETTh2 | 336 | 0.379/0.406 | 0.385/0.413 | 0.428/0.434 | 0.424/0.432 | 0.388/0.415 | 0.426/0.433 | 0.435/0.443 | 0.412/0.424 | 0.413/0.424 | 0.594/0.541
ETTh2 | 720 | 0.407/0.431 | 0.419/0.439 | 0.479/0.470 | 0.430/0.447 | 0.423/0.440 | 0.431/0.446 | 0.417/0.441 | 0.462/0.468 | 0.427/0.440 | 0.831/0.657
ETTh2 | Avg | 0.360/0.391 | 0.367/0.396 | 0.399/0.413 | 0.384/0.407 | 0.370/0.398 | 0.387/0.407 | 0.396/0.417 | 0.414/0.427 | 0.379/0.402 | 0.559/0.515
Solar-Energy | 96 | 0.191/0.225 | 0.195/0.251 | 0.198/0.251 | 0.208/0.238 | 0.197/0.252 | 0.234/0.286 | 0.228/0.263 | 0.250/0.292 | 0.312/0.399 | 0.290/0.378
Solar-Energy | 192 | 0.221/0.251 | 0.227/0.259 | 0.237/0.259 | 0.240/0.264 | 0.228/0.260 | 0.267/0.310 | 0.248/0.275 | 0.296/0.318 | 0.339/0.416 | 0.320/0.398
Solar-Energy | 336 | 0.252/0.287 | 0.247/0.275 | 0.248/0.276 | 0.249/0.274 | 0.253/0.287 | 0.290/0.315 | 0.291/0.301 | 0.319/0.330 | 0.368/0.430 | 0.353/0.415
Solar-Energy | 720 | 0.245/0.281 | 0.253/0.283 | 0.261/0.283 | 0.250/0.275 | 0.252/0.283 | 0.289/0.317 | 0.291/0.306 | 0.338/0.337 | 0.370/0.425 | 0.356/0.413
Solar-Energy | Avg | 0.228/0.261 | 0.230/0.267 | 0.236/0.267 | 0.237/0.263 | 0.233/0.271 | 0.270/0.307 | 0.265/0.286 | 0.301/0.319 | 0.347/0.417 | 0.330/0.400
Traffic | 96 | 0.421/0.290 | 0.404/0.274 | 0.435/0.294 | 0.392/0.268 | 0.525/0.342 | 0.462/0.295 | 0.594/0.336 | 0.593/0.321 | 0.508/0.342 | 0.650/0.396
Traffic | 192 | 0.443/0.292 | 0.427/0.288 | 0.451/0.299 | 0.413/0.277 | 0.514/0.346 | 0.466/0.296 | 0.615/0.347 | 0.617/0.336 | 0.609/0.387 | 0.598/0.370
Traffic | 336 | 0.464/0.320 | 0.440/0.294 | 0.463/0.302 | 0.425/0.283 | 0.531/0.357 | 0.482/0.304 | 0.623/0.351 | 0.629/0.336 | 0.640/0.402 | 0.605/0.373
Traffic | 720 | 0.487/0.329 | 0.466/0.307 | 0.496/0.321 | 0.460/0.301 | 0.569/0.373 | 0.514/0.322 | 0.608/0.343 | 0.640/0.350 | 0.715/0.442 | 0.645/0.394
Traffic | Avg | 0.454/0.308 | 0.434/0.291 | 0.461/0.304 | 0.422/0.282 | 0.535/0.355 | 0.481/0.304 | 0.610/0.344 | 0.620/0.336 | 0.618/0.393 | 0.625/0.383
Table 2 Ablation experiment results
Model | ECL (MSE/MAE) | Weather (MSE/MAE) | ETTh1 (MSE/MAE) | ETTh2 (MSE/MAE)
Fredformer | 0.176/0.269 | 0.244/0.272 | 0.445/0.432 | 0.367/0.396
w/o MFA | 0.172/0.266 | 0.242/0.272 | 0.439/0.431 | 0.361/0.392
Re MDA | 0.170/0.265 | 0.243/0.271 | 0.440/0.430 | 0.363/0.394
MFADA | 0.163/0.261 | 0.240/0.270 | 0.437/0.429 | 0.360/0.391
-
References:
[1] KONG Xiangjie, CHEN Zhenghao, LIU Weiyao, et al. Deep learning for time series forecasting: A survey[J]. International Journal of Machine Learning and Cybernetics, 2025, 16(5): 5079–5112. doi: 10.1007/s13042-025-02560-w.
[2] ZHONG Weiyi, ZHAI Dengshuai, XU Wenran, et al. Accurate and efficient daily carbon emission forecasting based on improved ARIMA[J]. Applied Energy, 2024, 376: 124232. doi: 10.1016/j.apenergy.2024.124232.
[3] PAN Jinwei, WANG Yiqiao, ZHONG Bo, et al. Statistical feature-based search for multivariate time series forecasting[J]. Journal of Electronics & Information Technology, 2024, 46(8): 3276–3284. doi: 10.11999/JEIT231264.
[4] DA SILVA D G and DE MOURA MENESES A A M. Comparing long short-term memory (LSTM) and bidirectional LSTM deep neural networks for power consumption prediction[J]. Energy Reports, 2023, 10: 3315–3334. doi: 10.1016/j.egyr.2023.09.175.
[5] ZHENG Qinghe, LI Binglin, YU Zhiguo, et al. Research progress of deep learning enabled automatic modulation classification technology[J]. Journal of Electronics & Information Technology, 2025, 47(11): 4096–4111. doi: 10.11999/JEIT250674.
[6] LIU Hui, FENG Haoran, MA Jiani, et al. Spatial self-attention incorporated imputation algorithm for severely missing multivariate time series[J]. Journal of Electronics & Information Technology, 2025, 47(10): 3917–3928. doi: 10.11999/JEIT250220.
[7] RABBANI M B A, MUSARAT M A, ALALOUL W S, et al. A comparison between seasonal autoregressive integrated moving average (SARIMA) and exponential smoothing (ES) based on time series model for forecasting road accidents[J]. Arabian Journal for Science and Engineering, 2021, 46(11): 11113–11138. doi: 10.1007/s13369-021-05650-3.
[8] WU Haixu, HU Tengge, LIU Yong, et al. TimesNet: Temporal 2D-variation modeling for general time series analysis[C]. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda, 2023.
[9] COUTINHO E R, MADEIRA J G F, BORGES D G F, et al. Multi-step forecasting of meteorological time series using CNN-LSTM with decomposition methods[J]. Water Resources Management, 2025, 39(7): 3173–3198. doi: 10.1007/s11269-025-04102-z.
[10] CAI Wanlin, LIANG Yuxuan, LIU Xianggen, et al. MSGNet: Learning multi-scale inter-series correlations for multivariate time series forecasting[C]. Proceedings of the 38th AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2024: 11141–11149. doi: 10.1609/aaai.v38i10.28991.
[11] YUNITA A, PRATAMA M H D I, ALMUZAKKI M Z, et al. Performance analysis of neural network architectures for time series forecasting: A comparative study of RNN, LSTM, GRU, and hybrid models[J]. MethodsX, 2025, 15: 103462. doi: 10.1016/j.mex.2024.103462.
[12] YADAV H and THAKKAR A. NOA-LSTM: An efficient LSTM cell architecture for time series forecasting[J]. Expert Systems with Applications, 2024, 238: 122333. doi: 10.1016/j.eswa.2023.122333.
[13] UBAL C, DI-GIORGI G, CONTRERAS-REYES J E, et al. Predicting the long-term dependencies in time series using recurrent artificial neural networks[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1340–1358. doi: 10.3390/make5040068.
[14] ZENG Ailing, CHEN Muxi, ZHANG Lei, et al. Are transformers effective for time series forecasting?[C]. Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, USA, 2023: 11121–11128. doi: 10.1609/aaai.v37i9.26317.
[15] JIANG Hongwei, LIU Dongsheng, DING Xinyi, et al. TCM: An efficient lightweight MLP-based network with affine transformation for long-term time series forecasting[J]. Neurocomputing, 2025, 617: 128960. doi: 10.1016/j.neucom.2024.128960.
[16] ZHOU Haoyi, ZHANG Shanghang, PENG Jieqi, et al. Informer: Beyond efficient transformer for long sequence time-series forecasting[C]. Proceedings of the 35th AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2021: 11106–11115. doi: 10.1609/aaai.v35i12.17325.
[17] WU Haixu, XU Jiehui, WANG Jianmin, et al. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting[C]. Proceedings of the 35th Conference on Neural Information Processing Systems, Red Hook, USA, 2021: 22419–22430.
[18] ZHOU Tian, MA Ziqing, WEN Qingsong, et al. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting[C]. Proceedings of the International Conference on Machine Learning, Baltimore, USA, 2022: 27268–27286.
[19] NIE Yuqi, NGUYEN N H, SINTHONG P, et al. A time series is worth 64 words: Long-term forecasting with transformers[C]. Proceedings of the 11th International Conference on Learning Representations, Kigali, Rwanda, 2023.
[20] WU Qiang, YAO Gechang, FENG Zhixi, et al. Peri-midFormer: Periodic pyramid transformer for time series analysis[C]. Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2024: 415. doi: 10.52202/079017-0415.
[21] LIU Yong, HU Tengge, ZHANG Haoran, et al. iTransformer: Inverted transformers are effective for time series forecasting[C]. Proceedings of the 12th International Conference on Learning Representations, Vienna, Austria, 2024.
[22] ZHAO Tianlong, FANG Lexin, MA Xiang, et al. TFformer: A time-frequency domain bidirectional sequence-level attention based transformer for interpretable long-term sequence forecasting[J]. Pattern Recognition, 2025, 158: 110994. doi: 10.1016/j.patcog.2024.110994.
[23] ZHOU Tian, NIU Peisong, WANG Xue, et al. One fits all: Power general time series analysis by pretrained LM[C]. Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, USA, 2023: 1877.
[24] PIAO Xihao, CHEN Zheng, MURAYAMA T, et al. Fredformer: Frequency debiased transformer for time series forecasting[C]. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 2024: 2400–2410. doi: 10.1145/3637528.3671928.
[25] GAO Shixuan, ZHANG Pingping, YAN Tianyu, et al. Multi-scale and detail-enhanced segment anything model for salient object detection[C]. Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 2024: 9894–9903. doi: 10.1145/3664647.3680650.
[26] SI Yunzhong, XU Huiying, ZHU Xinzhong, et al. SCSA: Exploring the synergistic effects between spatial and channel attention[J]. Neurocomputing, 2025, 634: 129866. doi: 10.1016/j.neucom.2025.129866.