Depression Screening Method Driven by Global-Local Feature Fusion

ZHANG Siyong, QIU Jiefan, ZHAO Xiangyun, XIAO Kejiang, CHEN Xiaofu, MAO Keji

Citation: ZHANG Siyong, QIU Jiefan, ZHAO Xiangyun, XIAO Kejiang, CHEN Xiaofu, MAO Keji. Depression Screening Method Driven by Global-Local Feature Fusion[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250035

doi: 10.11999/JEIT250035 cstr: 32379.14.JEIT250035
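Funds: The National Natural Science Foundation of China (62173158, 62377023), Zhejiang University of Technology (ZJUT) School-enterprise Joint Project (KYY-HX-20240328)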
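Details
    About the authors:

    ZHANG Siyong: male, master's student. Research interests: embedded operating systems, artificial intelligence, and computer vision

    QIU Jiefan: male, PhD, associate professor. Research interests: embedded operating systems, Internet of Things, and artificial intelligence

    ZHAO Xiangyun: male, master's student. Research interests: embedded operating systems, artificial intelligence, and computer vision

    XIAO Kejiang: male, PhD. Research interests: multi-source information fusion, artificial intelligence, intelligent sensing interconnection, and Internet of Things systems

    CHEN Xiaofu: male, master's degree. Research interests: artificial intelligence and computer vision

    MAO Keji: male, PhD, associate professor. Research interests: embedded operating systems, Internet of Things, and artificial intelligence

    Corresponding author:

    XIAO Kejiang xiaokj@ccnu.edu.cn

  • CLC number: TN911.7; TP391.41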

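  • Abstract: Existing machine-vision-based methods for depression recognition and screening often neglect local facial features. In practice, once the face is partially occluded, screening accuracy degrades severely, and effective screening may become impossible. To address this, this paper proposes an edge-vision depression screening method that builds a Global-Local Fusion Attention Network (GLFAN) to recognize the facial expression and the local eye features of the screened subject simultaneously. To improve the extraction of local eye features, a convolutional block attention module (CBAM) is introduced into the network to strengthen the capture of eye-movement trajectory features. Experimental results show that the method performs well on depression recognition. On a self-built dataset that includes facial occlusion, the precision, recall, and F1 score reach 0.76, 0.78, and 0.77, respectively, with recall 10.76 percentage points higher than the latest method. On the AVEC2013 and AVEC2014 datasets, the mean absolute error (MAE) is as low as 5.74 and 5.79, a relative improvement of 3.53% and 1.2% over the latest method. In addition, a visualization analysis shows how much attention the model pays to different facial regions, which further supports the effectiveness and rationality of the method. After deployment on an edge device, the average processing latency is reported as no more than 17.56 frame/s, offering a new option for depression screening.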
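  • Fig. 1  GLFAN network architecture

    Fig. 2  Key point sampling

    Fig. 3  Inverted bottleneck structure

    Fig. 4  CBAM structure

    Fig. 5  Axial attention structure

    Fig. 6  Screening subjects in the depression dataset

    Fig. 7  Jetson NX, camera, and experimental scene setup

    Fig. 8  Comparison of different numbers of feature extraction modules in the global and local branches

    Fig. 9  Classification results of GLFAN for different labels

    Fig. 10  Classification results of GLFAN for different data samples

    Fig. 11  Comparison heat maps on the self-built dataset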

    Algorithm 1  Concentration point search algorithm

     Input: frame sequence set $ S $
     Initialization: initialize $ S_c \in S $, current loop level $ \mathrm{Index} = 0 $
     REPEAT
     (1) Split the input frames evenly into two segments: $ S_0, S_1 = \mathrm{halfSplit}(S_c) $;
     (2) Keep the segment with the relatively larger offset: $ S_c = \mathrm{max\_offset}(\mathrm{abs}(S_0, S_1)) $;
     (3) $ \mathrm{Index} \mathrel{+}= 1 $;
     UNTIL $ \mathrm{len}(S_c) = 1 $;
     Output: the frame set $ S_c $ that contains the concentration point;
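The halving loop of Algorithm 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not define how the offset of a segment is measured, so the sketch assumes each frame carries one scalar offset value and scores a segment by its mean absolute offset. The names `mean_abs_offset` and `concentration_point_search` are hypothetical.

```python
from typing import Sequence, Tuple


def mean_abs_offset(segment: Sequence[float]) -> float:
    """Assumed segment score: mean absolute per-frame offset (not from the paper)."""
    return sum(abs(v) for v in segment) / len(segment)


def concentration_point_search(offsets: Sequence[float]) -> Tuple[int, int]:
    """Repeatedly halve the sequence, keeping the half with the larger offset.

    Returns (position of the remaining frame in the input, number of rounds),
    where the number of rounds plays the role of Index in Algorithm 1.
    """
    if not offsets:
        raise ValueError("offsets must contain at least one frame")
    start, current, index = 0, list(offsets), 0
    while len(current) > 1:                      # UNTIL len(S_c) = 1
        mid = len(current) // 2
        s0, s1 = current[:mid], current[mid:]    # step (1): halfSplit
        if mean_abs_offset(s0) >= mean_abs_offset(s1):
            current = s0                         # step (2): keep the larger-offset half
        else:
            current = s1
            start += mid
        index += 1                               # step (3)
    return start, index
```

For the input `[0.1, 0.2, 0.1, 0.3, 2.0, 0.2, 0.1, 0.1]` the sketch returns `(4, 3)`: the frame at position 4 survives after three halving rounds. Ties are resolved in favor of the first half, which is a design choice of this sketch rather than something the source specifies.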

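    Table 1  Sample distribution of the depression dataset

    Depression level  Samples without mask  Samples with mask  Total video duration (min)
    Mild depression  549366
    Moderate depression  746269
    Severe depression  730346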

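    Table 2  Depression symptom self-assessment questionnaire

    No.  Question
    1  How have your appetite and weight changed over the past two weeks?
    2  How has your body felt recently? Is anything unusual?
    3  How has your sleep been recently?
    4  How often have you talked with friends recently? How would your best friend describe you?
    5  How has your memory been recently? Do you often forget things?
    6  What has interested you in your study or work recently? How do you concentrate?
    7  When do you feel fatigued or short of motivation? Does it happen often?
    8  Have you had thoughts of suicide or self-harm? If so, what caused them?
    9  How has your mood been recently? Low, depressed, hopeless?
    10  What are you worried about right now? How will you deal with it?
    11  When do you feel that you are a terrible failure? Or have you let yourself or your family down?
    12  When have you felt that your movements, thinking, and speech have slowed down?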

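    Table 3  Ablation results for distance metrics

    Distance strategy  Precision  Recall  F1 score
    Cosine similarity  0.57  0.66  0.6117
    Manhattan distance  0.58  0.83  0.6828
    Chebyshev distance  0.72  0.66  0.6886
    Euclidean distance  0.75  0.96  0.8571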
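For reference, the four distance strategies compared in this ablation have the standard definitions sketched below. The sketch only shows the formulas on plain Python sequences; which feature vectors the paper compares is not stated in this excerpt, so the inputs here are placeholders and the function names are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L2 distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

With `a = [1, 2, 3]` and `b = [4, 6, 3]`, the Manhattan distance is 7, the Chebyshev distance is 4, and the Euclidean distance is 5. Cosine similarity ignores vector magnitude, whereas the three distances are sensitive to it.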

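    Table 4  Ablation results for different branch configurations

    Method  Precision  Recall  F1 score
    Global face branch only  0.74  0.68  0.7059
    Local branch only  0.63  0.58  0.6034
    Transformer classifier  0.71  0.74  0.7242
    GLFAN  0.76  0.78  0.7721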

    Table 5  Comparison of GLFAN with existing methods on the dataset of this paper

    Method  Precision  Recall  F1 score
    Shang et al.[33]  0.7011  0.6532  0.6764
    Zhou et al.[34]  0.7945  0.6546  0.7178
    De Melo et al.[34]  0.6909  0.6736  0.7716
    Onyema et al.[35]  0.6696  0.6366  0.6523
    GLFAN  0.7632  0.7812  0.7721
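The F1 score in these tables is conventionally the harmonic mean of precision and recall. The short sketch below assumes that standard definition (the source does not restate it) and checks it against the GLFAN row; the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# GLFAN row: precision 0.7632, recall 0.7812
glfan_f1 = f1_score(0.7632, 0.7812)
print(f"{glfan_f1:.4f}")
```

For the GLFAN row this gives about 0.7721, matching the reported F1 score.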

    Table 6  Comparison of GLFAN with other state-of-the-art models on the AVEC2013 and AVEC2014 datasets

    Method  AVEC2013 RMSE  AVEC2013 MAE  AVEC2014 RMSE  AVEC2014 MAE
    Song et al.[22]  8.10  6.16  7.15  5.95
    Shang et al.[33]  8.20  6.38  7.84  6.08
    Zhou et al.[34]  8.28  6.20  9.55  7.47
    De Melo et al.[34]  7.55  6.24  7.65  6.06
    Pan et al.[36]  7.26  5.97  7.30  5.99
    Casado et al.[37]  8.01  6.43  8.49  6.57
    Zhang et al.[38]  8.08  6.14  7.93  6.35
    Pan et al.[39]  7.98  6.15  7.75  6.00
    Dai et al.[40]  8.17 8.27
    Xu et al.[41]  7.57  5.95  7.18  5.86
    GLFAN  7.34  5.74  7.36  5.79
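The two error measures in this comparison, and the relative improvement quoted in the abstract, can be illustrated with the sketch below. It assumes the standard definitions of MAE and RMSE, and it assumes that the 3.53% and 1.2% figures are relative MAE reductions with respect to Xu et al. [41]; that reading is consistent with the numbers in the table but is not stated explicitly in the source. All function names are illustrative.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline: float, new: float) -> float:
    """Relative error reduction in percent (assumed meaning of 'improvement')."""
    return (baseline - new) / baseline * 100.0


# MAE of Xu et al. [41] versus GLFAN, taken from the table above
print(f"AVEC2013: {relative_reduction(5.95, 5.74):.2f}%")
print(f"AVEC2014: {relative_reduction(5.86, 5.79):.2f}%")
```

Under this assumption the AVEC2013 reduction is about 3.53% and the AVEC2014 reduction about 1.19%, which agrees with the abstract after rounding.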

    Table 7  Average depression detection latency (s) for different video lengths

    Network bandwidth (Mbit/(s·Hz))  5~7 min  7~9 min  9~11 min  11~13 min
    20  316  445  533  659
    10  453  572  739  902
    5  520  613  890  1032
    1  732  827  1059  1337
  • [1] WHO. Depression and other common mental disorders: Global health estimates[R]. 2017.
    [2] HERRMAN H, PATEL V, KIELING C, et al. Time for united action on depression: A Lancet–World Psychiatric Association Commission[J]. The Lancet, 2022, 399(10328): 957–1022. doi: 10.1016/S0140-6736(21)02141-3.
    [3] THAPAR A, EYRE O, PATEL V, et al. Depression in young people[J]. The Lancet, 2022, 400(10352): 617–631. doi: 10.1016/S0140-6736(22)01012-1.
    [4] ROBERSON K and CARTER R T. The relationship between race-based traumatic stress and the Trauma Symptom Checklist: Does racial trauma differ in symptom presentation?[J]. Traumatology, 2022, 28(1): 120–128. doi: 10.1037/trm0000306.
    [5] CARROZZINO D, PATIERNO C, PIGNOLO C, et al. The concept of psychological distress and its assessment: A clinimetric analysis of the SCL-90-R[J]. International Journal of Stress Management, 2023, 30(3): 235–248. doi: 10.1037/str0000280.
    [6] HAJDUSKA-DÉR B, KISS G, SZTAHÓ D, et al. The applicability of the Beck Depression Inventory and Hamilton Depression Scale in the automatic recognition of depression based on speech signal processing[J]. Frontiers in Psychiatry, 2022, 13: 879896. doi: 10.3389/fpsyt.2022.879896.
    [7] LEE A and PARK J. Diagnostic test accuracy of the beck depression inventory for detecting major depression in adolescents: A systematic review and meta-analysis[J]. Clinical Nursing Research, 2022, 31(8): 1481–1490. doi: 10.1177/10547738211065105.
    [8] ZHANG Wufang, MA Ning, WANG Xun, et al. Management and services for psychosis in the People's Republic of China in 2020[J]. Chinese Journal of Psychiatry, 2022, 55(2): 122–128. doi: 10.3760/cma.j.cn113661-20210818-00252.
    [9] BABU N V and KANAGA E G M. Sentiment analysis in social media data for depression detection using artificial intelligence: A review[J]. SN Computer Science, 2022, 3(1): 74. doi: 10.1007/s42979-021-00958-1.
    [10] EID M M, YUNDONG W, MENSAH G B, et al. Treating psychological depression utilising artificial intelligence: AI for precision medicine-focus on procedures[J]. Mesopotamian Journal of Artificial Intelligence in Healthcare, 2023, 2023: 76–81. doi: 10.58496/MJAIH/2023/015.
    [11] SANCHEZ A, VAZQUEZ C, MARKER C, et al. Attentional disengagement predicts stress recovery in depression: An eye-tracking study[J]. Journal of Abnormal Psychology, 2013, 122(2): 303–313. doi: 10.1037/a0031529.
    [12] ZHANG Kaipeng, ZHANG Zhanpeng, LI Zhifeng, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499–1503. doi: 10.1109/LSP.2016.2603342.
    [13] WANG Yang, WANG Hongyuan, and XIN Zihao. Efficient detection model of steel strip surface defects based on YOLO-V7[J]. IEEE Access, 2022, 10: 133936–133944. doi: 10.1109/ACCESS.2022.3230894.
    [14] HO J, KALCHBRENNER N, WEISSENBORN D, et al. Axial attention in multidimensional transformers[J]. arXiv preprint arXiv:1912.12180, 2019.
    [15] LIU Zhuang, MAO Hanzi, WU Chaoyuan, et al. A ConvNet for the 2020s[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 11966–11976. doi: 10.1109/CVPR52688.2022.01167.
    [16] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. The 15th European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 3–19. doi: 10.1007/978-3-030-01234-2_1.
    [17] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
    [18] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization[C]. The IEEE International Conference on Computer Vision, Venice, Italy, 2017: 618–626. doi: 10.1109/ICCV.2017.74.
    [19] ZHOU Xiuzhuang, JIN Kai, SHANG Yuanyuan, et al. Visually interpretable representation learning for depression recognition from facial images[J]. IEEE Transactions on Affective Computing, 2020, 11(3): 542–552. doi: 10.1109/TAFFC.2018.2828819.
    [20] DE MELO W C, GRANGER E, and HADID A. Depression detection based on deep distribution learning[C]. 2019 IEEE International Conference on Image Processing (ICIP), Taipei, China, 2019: 4544–4548. doi: 10.1109/ICIP.2019.8803467.
    [21] DE MELO W C, GRANGER E, and LOPEZ M B. Encoding temporal information for automatic depression recognition from facial analysis[C]. ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020: 1080–1084. doi: 10.1109/ICASSP40776.2020.9054375.
    [22] SONG Siyang, JAISWAL S, SHEN Linlin, et al. Spectral representation of behaviour primitives for depression analysis[J]. IEEE Transactions on Affective Computing, 2022, 13(2): 829–844. doi: 10.1109/TAFFC.2020.2970712.
    [23] YU Ming, XU Xinyi, SHI Shuo, et al. Depression recognition algorithm based on facial deep spatio-temporal features[J]. Video Engineering, 2020, 44(11): 12–18. doi: 10.16280/j.videoe.2020.11.04.
    [24] HE Lang, CHAN J C W, and WANG Zhongmin. Automatic depression recognition using CNN with attention mechanism from videos[J]. Neurocomputing, 2021, 422: 165–175. doi: 10.1016/j.neucom.2020.10.015.
    [25] NIU Mingyue, TAO Jianhua, and LIU Bin. Multi-scale and multi-region facial discriminative representation for automatic depression level prediction[C]. The ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 2021: 1325–1329. doi: 10.1109/ICASSP39728.2021.9413504.
    [26] ZHENG Yajing, WU Xiaohang, LIN Xiaoming, et al. The prevalence of depression and depressive symptoms among eye disease patients: A systematic review and meta-analysis[J]. Scientific Reports, 2017, 7(1): 46453. doi: 10.1038/srep46453.
    [27] BALTRUSAITIS T, ZADEH A, LIM Y C, et al. OpenFace 2.0: Facial behavior analysis toolkit[C]. 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 2018: 59–66. doi: 10.1109/FG.2018.00019.
    [28] BULAT A and TZIMIROPOULOS G. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks)[C]. The IEEE International Conference on Computer Vision, Venice, Italy, 2017: 1021–1030. doi: 10.1109/ICCV.2017.116.
    [29] HAN Kai, XIAO An, WU Enhua, et al. Transformer in transformer[C]. The 35th International Conference on Neural Information Processing Systems, 2021: 1217.
    [30] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. The 9th International Conference on Learning Representations, Austria, 2021.
    [31] DEROGATIS L R and CLEARY P A. Confirmation of the dimensional structure of the SCL-90: A study in construct validation[J]. Journal of Clinical Psychology, 1977, 33(4): 981–989. doi: 10.1002/1097-4679(197710)33:4<981::AID-JCLP2270330412>3.0.CO;2-0.
    [32] American Psychiatric Association, ZHANG Daolong, et al. (trans.). Diagnostic and Statistical Manual of Mental Disorders[M]. 5th ed. Beijing: Peking University Press, 2015: 70–71.
    [33] SHANG Yuanyuan, PAN Yuchen, JIANG Xiao, et al. LQGDNet: A local quaternion and global deep network for facial depression recognition[J]. IEEE Transactions on Affective Computing, 2023, 14(3): 2557–2563. doi: 10.1109/TAFFC.2021.3139651.
    [34] DE MELO W C, GRANGER E, and LÓPEZ M B. MDN: A deep maximization-differentiation network for spatio-temporal depression detection[J]. IEEE Transactions on Affective Computing, 2023, 14(1): 578–590. doi: 10.1109/TAFFC.2021.3072579.
    [35] ONYEMA E M, SHUKLA P K, DALAL S, et al. Enhancement of patient facial recognition through deep learning algorithm: ConvNet[J]. Journal of Healthcare Engineering, 2021, 2021(1): 5196000. doi: 10.1155/2021/5196000.
    [36] PAN Yuchen, SHANG Yuanyuan, SHAO Zhuhong, et al. Integrating deep facial priors into landmarks for privacy preserving multimodal depression recognition[J]. IEEE Transactions on Affective Computing, 2024, 15(3): 828–836. doi: 10.1109/TAFFC.2023.3296318.
    [37] CASADO C Á, CAÑELLAS M L, and LÓPEZ M B. Depression recognition using remote photoplethysmography from facial videos[J]. IEEE Transactions on Affective Computing, 2023, 14(4): 3305–3316. doi: 10.1109/TAFFC.2023.3238641.
    [38] ZHANG Shiqing, ZHANG Xingnan, ZHAO Xiaoming, et al. MTDAN: A lightweight multi-scale temporal difference attention networks for automated video depression detection[J]. IEEE Transactions on Affective Computing, 2024, 15(3): 1078–1089. doi: 10.1109/TAFFC.2023.3312263.
    [39] PAN Yuchen, SHANG Yuanyuan, LIU Tie, et al. Spatial–temporal attention network for depression recognition from facial videos[J]. Expert Systems with Applications, 2024, 237: 121410. doi: 10.1016/j.eswa.2023.121410.
    [40] DAI Ziqian, LI Qiuping, SHANG Yichen, et al. Depression detection based on facial expression, audio and gait[C]. 2023 IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 2023: 1568–1573. doi: 10.1109/ITNEC56291.2023.10082163.
    [41] XU Jiaqi, GUNES H, KUSUMAM K, et al. Two-stage temporal modelling framework for video-based depression recognition using graph representation[J]. IEEE Transactions on Affective Computing, 2025, 16(1): 161–178. doi: 10.1109/TAFFC.2024.3415770.
Publication history
  • Received:  2025-01-16
  • Revised:  2025-07-14
  • Published online:  2025-07-22