Joint Focus Measure and Context-Guided Filtering for Depth From Focus

JIANG Ying, DENG Huiping, XIANG Sen, WU Jin

Citation: JIANG Ying, DENG Huiping, XIANG Sen, WU Jin. Joint Focus Measure and Context-Guided Filtering for Depth From Focus[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250540


doi: 10.11999/JEIT250540 cstr: 32379.14.JEIT250540
Article information
    Author biographies:

    JIANG Ying: Female, master's student. Her research interests include graphics and image processing and depth estimation

    DENG Huiping: Female, associate professor. Her research interests include 3D video and image processing, machine learning, 3D information measurement, and video and image quality assessment

    XIANG Sen: Male, associate professor. His research interests include 3D video and image processing, machine learning, 3D information measurement, and video and image quality assessment

    WU Jin: Female, professor. Her research interests include image processing and pattern recognition, signal processing and multimedia communication, and detection technology and automation devices

    Corresponding author:

    DENG Huiping, denghuiping@wust.edu.cn

  • CLC number: TN911.73; TP391.41

  • Abstract: Depth from focus (DFF) infers scene depth by analyzing how the focus of each pixel varies across images; the key to the task is locating the best-focused pixel in the focal stack. In weakly textured regions, however, focus variations are usually subtle, which makes focused regions hard to detect and degrades the accuracy of depth estimation. This paper therefore proposes a focal-stack depth estimation network that combines focus measures with contextual information; it identifies the best-focused pixels in the focal stack precisely and infers a reliable depth map. The concept of focus measure operators is embedded in the deep learning framework: by strengthening the feature representation of focused regions, it improves the network's sensitivity to subtle focus variations in weakly textured areas. In addition, a semantic context guidance mechanism is introduced that uses aggregated scene semantics to guide the filtering and refinement of the focus volume. The network can thus capture local focus details and global context simultaneously and infer the focus state of weakly textured regions comprehensively. Comprehensive experiments show that the proposed model clearly outperforms competing algorithms in both subjective quality and objective metrics and generalizes well.
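    The sketch below is not the paper's network; it is a minimal NumPy illustration of the classical ingredients the abstract refers to: a focus measure (here the modified Laplacian) evaluated on every slice of the focal stack, aggregated over a local window into a focus volume, and reduced to a depth map with a soft-argmax over the focus distances. All function names, the window size, and the softmax temperature are illustrative assumptions.

```python
# Classical focus-measure / focus-volume pipeline (illustrative, not the paper's model).
import numpy as np

def modified_laplacian(img: np.ndarray) -> np.ndarray:
    """Per-pixel modified Laplacian |2I - I_left - I_right| + |2I - I_up - I_down|."""
    pad = np.pad(img, 1, mode="edge")
    c = pad[1:-1, 1:-1]
    dx = np.abs(2 * c - pad[1:-1, :-2] - pad[1:-1, 2:])
    dy = np.abs(2 * c - pad[:-2, 1:-1] - pad[2:, 1:-1])
    return dx + dy

def focus_volume(stack: np.ndarray, window: int = 9) -> np.ndarray:
    """stack: (N, H, W) grayscale focal stack -> (N, H, W) focus responses.
    Responses are summed over a local window (box filter via an integral image);
    the window size is the kind of knob examined in Fig. 4."""
    n, h, w = stack.shape
    r = window // 2
    vol = np.empty_like(stack, dtype=np.float64)
    for i in range(n):
        ml = modified_laplacian(stack[i].astype(np.float64))
        pad = np.pad(ml, ((r + 1, r), (r + 1, r)), mode="edge")
        ii = pad.cumsum(0).cumsum(1)  # integral image
        vol[i] = (ii[window:, window:] - ii[:-window, window:]
                  - ii[window:, :-window] + ii[:-window, :-window])
    return vol

def soft_argmax_depth(vol: np.ndarray, focus_dists: np.ndarray, temp: float = 10.0) -> np.ndarray:
    """Softmax the focus volume along the stack axis and take the expected
    focus distance, giving a sub-slice depth estimate per pixel."""
    logits = temp * (vol - vol.max(axis=0, keepdims=True))
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    return (prob * focus_dists[:, None, None]).sum(axis=0)

if __name__ == "__main__":
    stack = np.random.rand(10, 64, 64)      # stand-in focal stack
    dists = np.linspace(0.1, 1.0, 10)       # focus distance of each slice
    depth = soft_argmax_depth(focus_volume(stack), dists)
    print(depth.shape)                      # (64, 64)
```

    In weakly textured regions the focus responses along the stack axis are nearly flat, so the soft-argmax above becomes unreliable; this is the failure mode that the proposed focus-measure features and semantic context guidance are meant to address.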
  • Figure 1  Overall network architecture

    Figure 2  Comparison with other methods after adding focused-region detection

    Figure 3  Response difference between focused and defocused maps under different thresholds

    Figure 4  Effect of window size on the focus response

    Figure 5  Using contextual information to perceive the degree of focus in weakly textured regions

    Figure 6  Semantic feature extraction network

    Figure 7  Semantic context-guided module

    Figure 8  Qualitative results on FoD500

    Figure 9  Qualitative results on DDFF-12

    Figure 10  Qualitative results on the Mobile Depth dataset

    Table 1  Metric evaluation of each algorithm on the FoD500 dataset

    Method | MSE↓ | RMS↓ | logRMS↓ | Abs.rel.↓ | Sqr.rel.↓ | δ↑ | δ²↑ | δ³↑ | Bump.↓ | avgUnc.↓ | Time (ms)↓
    VDFF (2015) | 29.66×10⁻² | 5.05×10⁻¹ | 0.87 | 1.18 | 85.62×10⁻² | 17.92 | 32.66 | 50.31 | 1.12 | – | –
    RDF (2019) | 11.15×10⁻² | 3.22×10⁻¹ | 0.71 | 0.46 | 23.95×10⁻² | 39.48 | 64.65 | 76.13 | 1.54 | – | –
    DDFF (2018) | 3.34×10⁻² | 1.67×10⁻¹ | 0.27 | 0.17 | 3.56×10⁻² | 72.82 | 89.96 | 96.26 | 1.74 | – | 50.6
    DefocusNet (2020) | 2.18×10⁻² | 1.34×10⁻¹ | 0.24 | 0.15 | 3.59×10⁻² | 81.14 | 93.31 | 96.62 | 2.52 | – | 24.7
    FV (2022) | 1.88×10⁻² | 1.25×10⁻¹ | 0.21 | 0.14 | 2.43×10⁻² | 81.16 | 94.97 | 98.08 | 1.45 | 0.24 | 18.1
    DFV (2022) | 2.05×10⁻² | 1.29×10⁻¹ | 0.21 | 0.13 | 2.39×10⁻² | 81.90 | 94.68 | 98.05 | 1.43 | 0.17 | 18.2
    DDFS (2024) | 3.08×10⁻² | 1.49×10⁻¹ | 0.22 | 0.11 | 2.55×10⁻² | 87.02 | 94.38 | 96.99 | – | – | 22.8
    Ours | 1.79×10⁻² | 1.20×10⁻¹ | 0.20 | 0.13 | 2.29×10⁻² | 83.24 | 95.44 | 98.21 | 1.42 | 0.16 | 29.2

    Table 2  Metric evaluation of each algorithm on the DDFF-12 dataset

    Method | MSE↓ | RMS↓ | logRMS↓ | Abs.rel.↓ | Sqr.rel.↓ | δ↑ | δ²↑ | δ³↑ | Bump.↓ | avgUnc.↓ | Time (ms)↓
    VDFF (2015) | 156.55×10⁻⁴ | 12.14×10⁻² | 0.98 | 1.38 | 241.2×10⁻³ | 15.26 | 29.46 | 44.89 | 0.43 | – | –
    RDF (2019) | 91.18×10⁻⁴ | 9.41×10⁻² | 0.91 | 1.00 | 139.4×10⁻³ | 15.65 | 33.08 | 47.48 | 1.33 | – | –
    DDFF (2018) | 8.97×10⁻⁴ | 2.76×10⁻² | 0.28 | 0.24 | 9.47×10⁻³ | 61.26 | 88.70 | 96.49 | 0.52 | – | 191.7
    DefocusNet (2020) | 8.61×10⁻⁴ | 2.55×10⁻² | 0.23 | 0.17 | 6.00×10⁻³ | 72.56 | 94.15 | 97.92 | 0.46 | – | 34.3
    FV (2022) | 6.49×10⁻⁴ | 2.28×10⁻² | 0.23 | 0.18 | 7.10×10⁻³ | 71.93 | 92.80 | 97.86 | 0.42 | 5.20×10⁻² | 33.2
    DFV (2022) | 5.70×10⁻⁴ | 2.13×10⁻² | 0.21 | 0.17 | 6.26×10⁻³ | 76.74 | 94.23 | 98.14 | 0.42 | 4.99×10⁻² | 33.3
    Ours | 5.28×10⁻⁴ | 2.05×10⁻² | 0.20 | 0.16 | 5.53×10⁻³ | 77.95 | 95.59 | 98.36 | 0.41 | 4.47×10⁻² | 39.7
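    For reference, the error and accuracy columns in Tables 1 and 2 follow the depth-metric definitions that are standard in the DFF literature. The sketch below shows those common definitions (MSE, RMS, logRMS, absolute and squared relative error, and the δ accuracy ratios with thresholds 1.25^k). It is an assumed reference implementation, not the paper's evaluation code; the bumpiness and avgUnc columns are omitted.

```python
# Standard depth metrics for DFF benchmarks (reference sketch, not the paper's script).
import numpy as np

def depth_metrics(pred, gt, mask=None):
    """pred, gt: HxW arrays of positive depths; mask selects valid ground-truth pixels."""
    if mask is None:
        mask = gt > 0
    p, g = pred[mask], gt[mask]
    err = p - g
    out = {
        "MSE": float(np.mean(err ** 2)),
        "RMS": float(np.sqrt(np.mean(err ** 2))),
        "logRMS": float(np.sqrt(np.mean((np.log(p) - np.log(g)) ** 2))),
        "Abs.rel.": float(np.mean(np.abs(err) / g)),
        "Sqr.rel.": float(np.mean(err ** 2 / g)),
    }
    ratio = np.maximum(p / g, g / p)
    for k in (1, 2, 3):
        # share of pixels within a factor 1.25**k of the ground truth,
        # reported as a percentage (the δ, δ², δ³ columns)
        out[f"delta^{k}"] = float(100.0 * np.mean(ratio < 1.25 ** k))
    return out

# Example with synthetic data
gt = np.random.rand(64, 64) + 0.1
pred = gt + 0.01 * np.random.randn(64, 64)
print(depth_metrics(pred, gt)["RMS"])
```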

    Table 3  Ablation study

    Model | FM | SCGM | MSE↓ (DDFF-12) | Sqr.rel.↓ (DDFF-12) | δ↑ (DDFF-12) | MSE↓ (FoD500) | Sqr.rel.↓ (FoD500) | δ↑ (FoD500)
    Baseline | ✗ | ✗ | 5.70×10⁻⁴ | 6.26×10⁻³ | 76.74 | 2.05×10⁻² | 2.39×10⁻² | 81.90
    w/o FM | ✗ | ✓ | 5.31×10⁻⁴ | 5.98×10⁻³ | 80.01 | 2.01×10⁻² | 2.29×10⁻² | 82.36
    w/o SCGM | ✓ | ✗ | 5.65×10⁻⁴ | 6.21×10⁻³ | 76.95 | 1.85×10⁻² | 2.26×10⁻² | 82.98
    w/ all | ✓ | ✓ | 5.28×10⁻⁴ | 5.53×10⁻³ | 77.95 | 1.79×10⁻² | 2.29×10⁻² | 83.24

    Table 4  Comparison of parameter counts of each algorithm

    Method | Params (M)↓
    DefocusNet | 3.728
    DFV | 19.522
    DDFS | 32.528
    Baseline | 20.826
    Baseline+FM+SCGM | 22.094
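    Parameter counts such as those in Table 4 are conventionally obtained by summing the element count of every learnable tensor in the model. A small PyTorch sketch of that convention (illustrative, not tied to the paper's code):

```python
# Count learnable parameters of a PyTorch model in millions,
# as conventionally reported in tables like Table 4.
import torch.nn as nn

def param_count_m(model: nn.Module) -> float:
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

print(f"{param_count_m(nn.Conv2d(32, 64, 3)):.3f} M")  # ~0.018 M for this toy layer
```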

    Table 5  Effect of the number of channel expansions

    Dilation rate | MSE↓ (DDFF-12) | Sqr.rel.↓ (DDFF-12) | MSE↓ (FoD500) | Sqr.rel.↓ (FoD500)
    1 | 5.40×10⁻⁴ | 6.06×10⁻³ | 2.01×10⁻² | 2.28×10⁻²
    2 | 5.38×10⁻⁴ | 5.56×10⁻³ | 1.85×10⁻² | 2.26×10⁻²
    3 | 5.28×10⁻⁴ | 5.53×10⁻³ | 1.79×10⁻² | 2.29×10⁻²
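    Table 5 sweeps the dilation rate used when gathering surrounding context. As a rough illustration of how a context-guided block typically pairs a local convolution with a dilated one, in the spirit of the CGNet design cited as [28], here is a minimal PyTorch sketch; the paper's semantic context-guided module may be structured differently, and all layer choices below are assumptions.

```python
# Minimal context-guided block: local 3x3 branch plus a dilated 3x3 branch whose
# dilation rate is the hyper-parameter swept in Table 5 (illustrative sketch only).
import torch
import torch.nn as nn

class ContextGuidedBlock(nn.Module):
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        half = channels // 2          # assumes an even channel count
        self.reduce = nn.Conv2d(channels, half, kernel_size=1, bias=False)
        # local features: ordinary depthwise 3x3
        self.local = nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False)
        # surrounding context: dilated depthwise 3x3 (larger receptive field)
        self.context = nn.Conv2d(half, half, 3, padding=dilation,
                                 dilation=dilation, groups=half, bias=False)
        self.bn = nn.BatchNorm2d(2 * half)
        self.act = nn.PReLU(2 * half)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.reduce(x)
        joint = torch.cat([self.local(y), self.context(y)], dim=1)
        return x + self.act(self.bn(joint))   # residual connection

# Example: filtering a (batch, C, H, W) focus-volume feature map
feat = torch.randn(2, 32, 64, 64)
print(ContextGuidedBlock(32, dilation=2)(feat).shape)  # torch.Size([2, 32, 64, 64])
```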
  • [1] XIONG Haolin, MUTTUKURU S, XIAO Hanyuan, et al. Sparsegs: Sparse view synthesis using 3D Gaussian splatting[C]. 2025 International Conference on 3D Vision, Singapore, Singapore, 2025: 1032–1041. doi: 10.1109/3DV66043.2025.00100.
    [2] WESTERMEIER F, BRÜBACH L, WIENRICH C, et al. Assessing depth perception in VR and video see-through AR: A comparison on distance judgment, performance, and preference[J]. IEEE Transactions on Visualization and Computer Graphics, 2024, 30(5): 2140–2150. doi: 10.1109/TVCG.2024.3372061.
    [3] ZHOU Xiaoyu, LIN Zhiwei, SHAN Xiaojun, et al. DrivingGaussian: Composite Gaussian splatting for surrounding dynamic autonomous driving scenes[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 21634–21643. doi: 10.1109/CVPR52733.2024.02044.
    [4] JIANG Wentao, LIU Xiaoxuan, TU Chao, et al. Adaptive spatial and anomaly target tracking[J]. Journal of Electronics & Information Technology, 2022, 44(2): 523–533. doi: 10.11999/JEIT201025.
    [5] CHEN Rongshan, SHENG Hao, YANG Da, et al. Pixel-wise matching cost function for robust light field depth estimation[J]. Expert Systems with Applications, 2025, 262: 125560. doi: 10.1016/j.eswa.2024.125560.
    [6] WANG Yingqian, WANG Longguang, LIANG Zhengyu, et al. Occlusion-aware cost constructor for light field depth estimation[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 19777–19786. doi: 10.1109/CVPR52688.2022.01919.
    [7] KE Bingxin, OBUKHOV A, HUANG Shengyu, et al. Repurposing diffusion-based image generators for monocular depth estimation[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 9492–9502. doi: 10.1109/CVPR52733.2024.00907.
    [8] PATNI S, AGARWAL A, and ARORA C. ECoDepth: Effective conditioning of diffusion models for monocular depth estimation[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 28285–28295. doi: 10.1109/CVPR52733.2024.02672.
    [9] SI Haozhe, ZHAO Bin, WANG Dong, et al. Fully self-supervised depth estimation from defocus clue[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 9140–9149. doi: 10.1109/CVPR52729.2023.00882.
    [10] YANG Xinge, FU Qiang, ELHOSEINY M, et al. Aberration-aware depth-from-focus[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, 47(9): 7268–7278. doi: 10.1109/TPAMI.2023.3301931.
    [11] JEON H G, SURH J, IM S, et al. Ring difference filter for fast and noise robust depth from focus[J]. IEEE Transactions on Image Processing, 2020, 29: 1045–1060. doi: 10.1109/TIP.2019.2937064.
    [12] FAN Tiantian and YU Hongbin. A novel shape from focus method based on 3D steerable filters for improved performance on treating textureless region[J]. Optics Communications, 2018, 410: 254–261. doi: 10.1016/j.optcom.2017.10.019.
    [13] SURH J, JEON H G, PARK Y, et al. Noise robust depth from focus using a ring difference filter[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2444–2453. doi: 10.1109/CVPR.2017.262.
    [14] THELEN A, FREY S, HIRSCH S, et al. Improvements in shape-from-focus for holographic reconstructions with regard to focus operators, neighborhood-size, and height value interpolation[J]. IEEE Transactions on Image Processing, 2009, 18(1): 151–157. doi: 10.1109/TIP.2008.2007049.
    [15] MAHMOOD M T and CHOI T S. Nonlinear approach for enhancement of image focus volume in shape from focus[J]. IEEE Transactions on Image Processing, 2012, 21(5): 2866–2873. doi: 10.1109/TIP.2012.2186144.
    [16] MAHMOOD M T. Shape from focus by total variation[C]. IVMSP 2013, Seoul, Korea (South), 2013: 1–4. doi: 10.1109/IVMSPW.2013.6611940.
    [17] MOELLER M, BENNING M, SCHÖNLIEB C, et al. Variational depth from focus reconstruction[J]. IEEE Transactions on Image Processing, 2015, 24(12): 5369–5378. doi: 10.1109/TIP.2015.2479469.
    [18] HAZIRBAS C, SOYER S G, STAAB M C, et al. Deep depth from focus[C]. 14th Asian Conference on Computer Vision, Perth, Australia, 2019: 525–541. doi: 10.1007/978-3-030-20893-6_33.
    [19] CHEN Zhang, GUO Xinqing, LI Siyuan, et al. Deep eyes: Joint depth inference using monocular and binocular cues[J]. Neurocomputing, 2021, 453: 812–824. doi: 10.1016/j.neucom.2020.06.132.
    [20] MAXIMOV M, GALIM K, and LEAL-TAIXÉ L. Focus on defocus: Bridging the synthetic to real domain gap for depth estimation[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 1068–1077. doi: 10.1109/CVPR42600.2020.00115.
    [21] WON C and JEON H G. Learning depth from focus in the wild[C]. 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 1–18. doi: 10.1007/978-3-031-19769-7_1.
    [22] WANG N H, WANG Ren, LIU Yulun, et al. Bridging unsupervised and supervised depth from focus via all-in-focus supervision[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 12601–12611. doi: 10.1109/ICCV48922.2021.01239.
    [23] YANG Fengting, HUANG Xiaolei, and ZHOU Zihan. Deep depth from focus with differential focus volume[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 12632–12641. doi: 10.1109/CVPR52688.2022.01231.
    [24] DENG Huiping, SHENG Zhichao, XIANG Sen, et al. Depth estimation based on semantic guidance for light field image[J]. Journal of Electronics & Information Technology, 2022, 44(8): 2940–2948. doi: 10.11999/JEIT210545.
    [25] HE Mengfei, YANG Zhiyou, ZHANG Guangben, et al. IIMT-net: Poly-1 weights balanced multi-task network for semantic segmentation and depth estimation using interactive information[J]. Image and Vision Computing, 2024, 148: 105109. doi: 10.1016/j.imavis.2024.105109.
    [26] PERTUZ S, PUIG D, and GARCIA M A. Analysis of focus measure operators for shape-from-focus[J]. Pattern Recognition, 2013, 46(5): 1415–1432. doi: 10.1016/j.patcog.2012.11.011.
    [27] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944. doi: 10.1109/CVPR.2017.106.
    [28] WU Tianyi, TANG Sheng, ZHANG Rui, et al. CGNet: A light-weight context guided network for semantic segmentation[J]. IEEE Transactions on Image Processing, 2021: 1169–1179. doi: 10.1109/TIP.2020.3042065.
    [29] SUWAJANAKORN S, HERNANDEZ C, and SEITZ S M. Depth from focus with your mobile phone[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 3497–3506. doi: 10.1109/CVPR.2015.7298972.
    [30] FUJIMURA Y, IIYAMA M, FUNATOMI T, et al. Deep depth from focal stack with defocus model for camera-setting invariance[J]. International Journal of Computer Vision, 2024, 132(6): 1970–1985. doi: 10.1007/s11263-023-01964-x.
Article history
  • Received: 2025-06-09
  • Revised: 2025-10-09
  • Published online: 2025-10-16
