A Multi-view Feature Extraction and Dual-edge Contrastive Learning Approach for Image Forgery Detection

XU Zhuang; YE Ziyi; PAN Enkang; LIU Chunxiao

doi:10.11999/JEIT251271


XU Zhuang, YE Ziyi, PAN Enkang, LIU Chunxiao. A Multi-view Feature Extraction and Dual-edge Contrastive Learning Approach for Image Forgery Detection[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT251271



doi: 10.11999/JEIT251271 cstr: 32379.14.JEIT251271

XU Zhuang¹, YE Ziyi¹, PAN Enkang¹, LIU Chunxiao¹,²

1. School of Computer Science and Technology, Zhejiang Gongshang University, Hangzhou 310018, China
2. Zhejiang Key Laboratory of Big Data and Future E-Commerce Technology, Hangzhou 310018, China

Funds: The National Natural Science Foundation of China (61976188), Zhejiang Provincial Natural Science Foundation of China (LY24F020004), The National College Students Innovation and Entrepreneurship Training Program (202510353027), Zhejiang Provincial College Students Innovation and Entrepreneurship Training Program (S202510353076)

Received Date: 2025-12-01
Revised Date: 2026-04-29
Accepted Date: 2026-05-12

Available Online: 2026-05-27

Abstract


Objective: With the rapid development and wide use of image editing tools, such as Adobe Photoshop and Meitu, realistic forged images can now be created and disseminated with increasing ease. This trend poses challenges to visual content authentication in journalism, forensic analysis, and social security. Existing image forgery detection methods usually define the task as pixel-wise binary classification. This formulation may cause label conflicts, especially when the same object has different labels in different images. In addition, most methods mainly focus on spatial-domain features and make limited use of complementary information from other views, such as noise-domain clues.

Methods: To address these limitations, this paper proposes an image forgery detection algorithm based on multi-view feature extraction and dual-edge contrastive learning. The detection task is reformulated as intra-image inconsistency detection, which avoids label conflicts caused by conventional pixel-wise classification. To reduce semantic ambiguity near tampered boundaries, a dual-edge contrastive learning strategy is designed. Inner-edge and outer-edge features are extracted and contrasted separately, and non-edge tampered and non-tampered features are also contrasted. This strategy guides the model to focus on difficult edge samples and improves boundary detection accuracy. A dual-branch multi-view feature encoder is further developed to extract complementary forgery clues. The spatial-domain branch uses a High-Resolution Network (HRNet) backbone to extract multi-scale spatial features. A mixture-of-experts gating mechanism dynamically weights features across scales and fuses residuals between adjacent scales, which helps capture subtle forgery traces.
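The mixture-of-experts weighting described above can be illustrated with a minimal NumPy sketch. It assumes the multi-scale feature maps have already been resized to a common shape, and it uses a hypothetical random matrix `gate_weights` in place of a learned gating layer. The function name `moe_fuse`, the global-average-pooling descriptor, and the softmax gate are illustrative assumptions, not the paper's implementation. The residual fusion between adjacent scales is not modeled here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum before exponentiating for numerical stability.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_fuse(features, gate_weights):
    """Fuse same-shaped feature maps with input-dependent softmax weights.

    features: list of K arrays of shape (C, H, W), one per scale ("expert"),
        assumed to be resized to a common resolution beforehand.
    gate_weights: hypothetical gating matrix of shape (K, K * C) that stands
        in for a learned gating layer.
    """
    stacked = np.stack(features)                         # (K, C, H, W)
    # Global average pooling: one value per expert and channel.
    descriptor = stacked.mean(axis=(2, 3)).reshape(-1)   # (K * C,)
    weights = softmax(gate_weights @ descriptor)         # (K,), sums to 1
    # Weighted sum over the expert axis.
    fused = np.tensordot(weights, stacked, axes=1)       # (C, H, W)
    return fused, weights

rng = np.random.default_rng(0)
feats = [rng.standard_normal((4, 8, 8)) for _ in range(3)]
gate = rng.standard_normal((3, 12))
fused, w = moe_fuse(feats, gate)
print(fused.shape, w.round(3))
```

Because the gate is computed from the input's own pooled statistics, the weights change from image to image, which is the sense in which the weighting is dynamic.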
The noise-domain branch extracts multiple noise-related features, including noise fingerprint features, Spatial Rich Model (SRM) filter responses, Bayar convolution features, max-pooling features, average-pooling residuals, and learnable Fourier-domain features with adaptive masking. A mixture-of-experts strategy is also used to dynamically assign weights to these heterogeneous features according to the characteristics of each input image. During training, the fused multi-view features are optimized using the dual-edge contrastive learning framework, which strengthens discrimination between tampered and non-tampered regions, particularly near their boundaries. During inference, K-means clustering is applied to the learned feature representations to locate tampered regions without explicit pixel labels.

Results and Discussions: Extensive experiments are conducted on widely used benchmark datasets, including NIST, Columbia, COVERAGE, DSO, and CASIA-v1. These datasets cover different forgery types, including splicing, copy-move, object removal, and post-processing. The proposed method consistently outperforms state-of-the-art methods. Compared with the best existing methods, it improves the average permuted F1 (pF1) and permuted Intersection over Union (pIoU) by 26.0% and 10.1%, respectively (Table 3). Visualization results show more accurate localization of tampered regions, especially along tampered boundaries, with fewer false positives and clearer edge delineation (Fig. 5). Ablation studies further verify the effectiveness of each key component, including multi-view feature extraction, the mixture-of-experts fusion mechanism for noise features, and the dual-edge contrastive learning strategy (Tables 4–6).

Conclusions: This paper presents an image forgery detection framework that addresses the limitations of conventional classification-based methods by modeling the task as intra-image inconsistency detection.
Dual-edge contrastive learning reduces semantic ambiguity at tampered boundaries, and the multi-view feature encoder extracts complementary spatial-domain and noise-domain clues. Experimental results on different datasets show improved detection accuracy and boundary precision. Future work will explore the extension of the inconsistency detection paradigm to additional modalities, such as text, for multimodal forgery detection.
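
As an illustration of the dual-edge idea, the following sketch splits a binary tampering mask into the four pixel groups that the abstract contrasts: inner edge, outer edge, non-edge tampered, and non-edge non-tampered. The morphological construction, the 4-neighbour structuring element, the `width` parameter, and all function names are assumptions made for illustration. The contrastive loss itself is not shown.

```python
import numpy as np

def dilate(mask):
    """One step of binary dilation with a 4-neighbour (cross) element."""
    p = np.pad(mask, 1, constant_values=False)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def erode(mask):
    """One step of binary erosion, written as the dual of dilation.

    Pixels outside the image are treated as foreground, so the image
    border alone never creates an edge.
    """
    return ~dilate(~mask)

def dual_edge_regions(mask, width=1):
    """Partition the image into inner edge, outer edge, and two cores."""
    mask = np.asarray(mask, dtype=bool)
    shrunk, grown = mask, mask
    for _ in range(width):
        shrunk = erode(shrunk)
        grown = dilate(grown)
    inner_edge = mask & ~shrunk      # tampered pixels next to the boundary
    outer_edge = grown & ~mask       # non-tampered pixels next to the boundary
    tampered_core = shrunk           # non-edge tampered pixels
    authentic_core = ~grown          # non-edge non-tampered pixels
    return inner_edge, outer_edge, tampered_core, authentic_core

# Toy example: a 3x3 tampered square inside a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
inner, outer, t_core, a_core = dual_edge_regions(mask, width=1)
print(inner.sum(), outer.sum(), t_core.sum(), a_core.sum())
```

The four masks are disjoint and cover the whole image, so features can be sampled from each group separately when forming contrastive pairs.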

Keywords: Image forgery detection; Contrastive learning; Multi-view feature extraction; Edge refinement
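
The clustering-based inference and the permuted F1 metric mentioned in the abstract can be sketched as follows. This is a simplified example under stated assumptions: a hand-written two-cluster K-means with a deterministic initialisation, synthetic 2-D features in place of learned representations, and permuted F1 taken as the better F1 of the two possible cluster-to-label assignments. That reading of "permuted" is an assumption, motivated by the fact that cluster indices carry no meaning. All function names are hypothetical.

```python
import numpy as np

def kmeans_two_clusters(features, n_iter=20):
    """Plain Lloyd's K-means with K=2 on an (N, D) feature array."""
    features = np.asarray(features, dtype=float)
    # Deterministic start: the point farthest from the mean, then the
    # point farthest from that first centre.
    c0 = features[np.argmax(np.linalg.norm(features - features.mean(0), axis=1))]
    c1 = features[np.argmax(np.linalg.norm(features - c0, axis=1))]
    centers = np.stack([c0, c1])
    for _ in range(n_iter):
        # Distance of every point to both centres, shape (N, 2).
        dists = np.linalg.norm(features[:, None, :] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(axis=0)
    return labels

def f1_score(pred, gt):
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def permuted_f1(pred, gt):
    # Cluster ids are arbitrary, so score both assignments and keep the best.
    return max(f1_score(pred, gt), f1_score(~pred, gt))

rng = np.random.default_rng(0)
gt = np.zeros((6, 6), dtype=bool)
gt[1:3, 1:3] = True                        # synthetic tampered region
feats = rng.normal(0.0, 0.1, size=(36, 2))
feats[gt.ravel()] += 5.0                   # tampered pixels lie far from the rest
labels = kmeans_two_clusters(feats)
pred = labels.reshape(gt.shape).astype(bool)
print(permuted_f1(pred, gt))
```

Scoring both assignments means the result does not depend on which cluster index happens to be given to the tampered region.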

