Facial Expression Recognition Model Based on Improved YOLO12n

HAN Chuang, HUANG Jingyao, LAN Chaofeng

Citation: HAN Chuang, HUANG Jingyao, LAN Chaofeng. Facial Expression Recognition Model based on Improved YOLO12n[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250936


doi: 10.11999/JEIT250936 cstr: 32379.14.JEIT250936
Funds: Supported by the Program for Young Talents of Basic Research in Universities of Heilongjiang Province (YQJH2024077)
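Author information
    About the authors:

    HAN Chuang: male, associate professor. Research interests: image recognition and small-target localization.

    HUANG Jingyao: female, master's student. Research interests: image processing and object detection.

    LAN Chaofeng: female, professor. Research interests: multi-object classification and tracking in images.

    Corresponding author:

    LAN Chaofeng, lanchaofeng@hrbust.edu.cn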

  • CLC number: TP391.41; TP183


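  • Abstract: Facial expression recognition accuracy drops under low resolution, complex lighting, and partial occlusion. To address this, this paper proposes YOLO-FER (Facial Expression Recognition), an improved model based on YOLO12n. The model makes four changes. First, a NewStarBlock module optimizes the original C3k2 bottleneck structure to mitigate the loss of high-dimensional features. Second, a Multidimensional Collaborative Attention (MCA) module jointly models the channel, height, and width dimensions to strengthen fine-grained feature extraction. Third, a Low Resolution Feature Extractor (LRFE) module improves robustness in low-light and blurred scenes. Fourth, the Adaptive Threshold Focal Loss (ATFL) dynamically adjusts the weights of easy and hard samples to alleviate class imbalance. Experiments on the RAF-DB and Low Light Dataset (LLD) datasets show that YOLO-FER improves mAP@0.5 over the baseline YOLO12n by 3.8 and 5.0 percentage points, respectively. The model retains real-time detection speed while improving generalization and robustness, captures key expression regions more accurately, and is suitable for practical expression recognition scenarios.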
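As a rough illustration of the idea behind the StarBlock family that NewStarBlock builds on, the sketch below applies the star operation described in reference [16]: two linear branches of the same input are combined by element-wise multiplication, with an activation on one branch. It is a toy in plain Python on a single feature vector. The weights, the vector sizes, and the helper names (`relu6`, `linear`, `star_op`) are illustrative assumptions, and this is not the authors' NewStarBlock, whose layout is given only in Figure 3.

```python
def relu6(values):
    """ReLU6 applied element-wise: min(max(x, 0), 6)."""
    return [min(max(x, 0.0), 6.0) for x in values]


def linear(weights, bias, x):
    """Dense layer y = W x + b for a single input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def star_op(x, w1, b1, w2, b2):
    """Star operation: act(W1 x + b1) * (W2 x + b2), element-wise."""
    branch_a = relu6(linear(w1, b1, x))
    branch_b = linear(w2, b2, x)
    return [a * b for a, b in zip(branch_a, branch_b)]


# Hypothetical 2-channel input expanded to a 3-channel hidden width.
x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[0.5, 0.5], [1.0, -1.0], [2.0, 0.0]]
b2 = [0.0, 1.0, 0.0]

y = star_op(x, w1, b1, w2, b2)
print(y)  # [1.5, 0.0, 0.0]
```

Because the two branches are multiplied rather than added, each output is a sum of pairwise products of input terms, which is the property reference [16] credits for a richer implicit feature space at low cost.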
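  • Figure 1  Structure of the YOLO-FER model
    Figure 2  Structure of the improved C3k2
    Figure 3  Structures of the StarBlock and NewStarBlock modules
    Figure 4  Structure of MCA
    Figure 5  Structure of the LRFE module
    Figure 6  Sample images from the RAF-DB dataset
    Figure 7  Sample images from the LLD dataset
    Figure 8  Comparison of mAP@0.5 curves for the improved models
    Figure 9  Comparison of normalized confusion matrices
    Figure 10  mAP@0.5 and PR curve comparisons for different activation functions
    Figure 11  mAP@0.5 and PR curve comparisons for different dilation rates
    Figure 12  mAP@0.5 and PR curve comparisons for different values of λ
    Figure 13  Comparison of heat maps
    Figure 14  Sample test results on the degraded test set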

    Table 1  Expression class distribution in the RAF-DB and LLD datasets

    Class       RAF-DB samples    LLD samples
    Angry       867               1156
    Disgust     877               1726
    Fear        355               /
    Happy       5957              2064
    Neutral     3204              1748
    Sad         2460              1896
    Surprise    1619              2158
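The class counts in Table 1 are uneven, which is the kind of imbalance ATFL is meant to address. The sketch below, in plain Python, computes the ratio between the largest and smallest RAF-DB classes and a set of inverse-frequency class weights. The weighting scheme is a common generic heuristic shown only for illustration; it is an assumption here and is not the paper's ATFL.

```python
# RAF-DB sample counts copied from Table 1.
raf_db_counts = {
    "Angry": 867, "Disgust": 877, "Fear": 355, "Happy": 5957,
    "Neutral": 3204, "Sad": 2460, "Surprise": 1619,
}

total = sum(raf_db_counts.values())
largest = max(raf_db_counts, key=raf_db_counts.get)
smallest = min(raf_db_counts, key=raf_db_counts.get)
imbalance_ratio = raf_db_counts[largest] / raf_db_counts[smallest]

# Inverse-frequency weights (illustrative heuristic, not from the paper):
# weight = total / (number of classes * class count).
n_classes = len(raf_db_counts)
class_weights = {
    name: total / (n_classes * count) for name, count in raf_db_counts.items()
}

print(total, largest, smallest, round(imbalance_ratio, 2))
for name, weight in class_weights.items():
    print(f"{name}: {weight:.2f}")
```

With these counts the largest class (Happy) has roughly 17 times as many samples as the smallest (Fear), so a loss that treats every sample equally is dominated by the frequent classes.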

    Table 2  Ablation experiments on the RAF-DB dataset

    Baseline  C3k2_star  A2C2f_MCA  LRFE  ATFL  P(%)  R(%)  F1    mAP@0.5(%)  Params/M
                                                78.4  78.6  0.78  83.8        2.5
                                                78.8  80.7  0.79  85.9        2.5
                                                78.6  82.7  0.81  86.9        2.5
                                                81.2  80.8  0.81  86.8        2.8
                                                78.7  81.3  0.80  86.7        2.5
                                                80.1  81.5  0.81  87.0        2.8
                                                80.6  81.7  0.81  87.3        3.0
                                                81.8  81.9  0.82  87.6        3.0

    Table 3  Ablation experiments on the LLD dataset

    Baseline  C3k2_star  A2C2f_MCA  LRFE  ATFL  P(%)  R(%)  F1    mAP@0.5(%)  Params/M
                                                87.3  82.8  0.85  90.9        2.5
                                                91.0  87.5  0.89  94.0        2.5
                                                89.6  87.3  0.88  93.2        2.5
                                                92.4  87.4  0.90  94.4        2.8
                                                90.1  87.9  0.89  93.8        2.5
                                                89.2  89.6  0.90  94.2        2.8
                                                92.3  89.4  0.91  95.0        3.0
                                                91.9  91.2  0.92  95.9        3.0

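    Table 4  Ability to distinguish similar expressions (Angry vs. Disgust)

    Model                R (Angry)  R (Disgust)  Confusion (Angry→Disgust)  Confusion (Disgust→Angry)
    YOLO12n              0.69       0.45         0.10                       0.10
    YOLO12n + A2C2f_MCA  0.75       0.59         0.09                       0.06
    Δ                    +0.06      +0.14        -0.01                      -0.04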
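Table 4 reports per-class recall and pairwise confusion rates of the kind read off a normalized confusion matrix (Figure 9). A minimal sketch of that normalization follows, assuming rows are true classes and columns are predictions, so that each row is divided by its sum and the diagonal entry is the class recall. The orientation is an assumption, and the counts are hypothetical: they are scaled to 100 samples per class so that the Angry and Disgust rates echo the YOLO12n + A2C2f_MCA row, and they are not the paper's data.

```python
def normalize_rows(matrix):
    """Divide each row by its sum so entries become rates per true class."""
    normalized = []
    for row in matrix:
        row_total = sum(row)
        normalized.append([c / row_total if row_total else 0.0 for c in row])
    return normalized


# Hypothetical counts: rows = true class, columns = predicted class.
labels = ["Angry", "Disgust", "Other"]
counts = [
    [75, 9, 16],   # true Angry
    [6, 59, 35],   # true Disgust
    [2, 3, 95],    # true Other
]

rates = normalize_rows(counts)
recall = {name: rates[i][i] for i, name in enumerate(labels)}
angry_as_disgust = rates[0][1]   # share of Angry samples predicted as Disgust
disgust_as_angry = rates[1][0]   # share of Disgust samples predicted as Angry

print(recall)
print(angry_as_disgust, disgust_as_angry)
```

Read this way, a higher diagonal entry together with lower off-diagonal entries for the Angry and Disgust pair is what the Δ row of Table 4 summarizes.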

    Table 5  Comparison with YOLO-series models

    Model     P(%)  R(%)  F1    mAP@0.5(%)  mAP@0.5 std. dev.  Params/M
    YOLOv8n   81.4  77.9  0.79  84.1        0.0035             3.0
    YOLOv10n  71.3  77.8  0.74  81.6        0.0062             2.7
    YOLO11n   81.5  79.4  0.80  85.9        0.0052             2.6
    YOLO12n   78.4  78.6  0.78  83.8        0.0041             2.5
    YOLO-FER  81.8  81.9  0.82  87.6        0.0029             3.0

    Table 6  Results of mainstream methods on RAF-DB

    Model              Accuracy (%)  FLOPs/10^9  Params/M
    RAN-ResNet18[6]    86.90         14.55       11.19
    POSTER++[7]        92.21         8.4         8.4
    MA-Net[9]          88.40         3.65        50.54
    DAN[10]            89.70         2.3         19.72

    Table 7  Comparison of computational complexity and inference speed of different models on an RTX4090

    Model     Input resolution  GFLOPs  FPS     Params/M
    YOLOv8n   640×640           8.2     603.86  3.0
    YOLOv10n  640×640           8.4     781.71  2.7
    YOLO11n   640×640           6.4     677.36  2.6
    YOLO12n   640×640           5.8     619.49  2.5
    YOLO-FER  640×640           7.7     503.53  3.0
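To make the speed figures in Table 7 easier to compare, the sketch below converts FPS to an average per-frame latency and expresses YOLO-FER's speed relative to the YOLO12n baseline. It assumes FPS is simply the reciprocal of the mean time per frame; the page does not say whether pre- and post-processing are included, so the latencies are indicative only.

```python
# FPS values at 640x640 copied from Table 7.
fps = {
    "YOLOv8n": 603.86, "YOLOv10n": 781.71, "YOLO11n": 677.36,
    "YOLO12n": 619.49, "YOLO-FER": 503.53,
}


def latency_ms(frames_per_second):
    """Average time per frame in milliseconds, assuming latency = 1 / FPS."""
    return 1000.0 / frames_per_second


latencies = {name: latency_ms(value) for name, value in fps.items()}
extra_ms = latencies["YOLO-FER"] - latencies["YOLO12n"]
relative_speed = fps["YOLO-FER"] / fps["YOLO12n"]

for name, value in latencies.items():
    print(f"{name}: {value:.3f} ms/frame")
print(f"extra latency vs YOLO12n: {extra_ms:.3f} ms")
print(f"relative speed vs YOLO12n: {relative_speed:.2%}")
```

Under this assumption YOLO-FER needs about 2 ms per frame, roughly 0.4 ms more than YOLO12n, which is consistent with the claim that the added modules keep the model in the real-time range.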

    Table 8  Computational complexity and inference speed of YOLO-FER at different input resolutions

    Input resolution  GFLOPs  FPS
    320×320           7.7     1291.76
    480×480           7.7     803.41
    640×640           7.7     503.53

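    Table 9  Hyperparameter value ranges

    Module        Hyperparameter                    Value range    Baseline value
    NewStarBlock  Activation function               ReLU6, SiLU    SiLU
    LRFE          dilation (dilation rate)          1, 2, 3        2
    ATFL          λ (loss modulation coefficient)   1.5, 2.0, 2.5  2.0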

    Table 10  Sensitivity analysis results for the activation function

    Activation function  P(%)  R(%)  F1    mAP@0.5(%)
    ReLU6                88.5  86.3  0.87  93
    SiLU                 91.9  91.2  0.92  95.9

    Table 11  Sensitivity analysis results for the dilation rate

    Dilation rate  P(%)  R(%)  F1    mAP@0.5(%)
    1              88.9  87.8  0.88  93.7
    2              91.9  91.2  0.92  95.9
    3              91.8  91.1  0.91  95.7
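The dilation rate controls how far apart a convolution's taps are spread, and therefore how much context each output sees. The sketch below evaluates the standard formula for the effective kernel size of a dilated convolution, k_eff = k + (k - 1)(d - 1), for the three dilation rates tested. The base kernel size of 3 is an assumption for illustration; this page does not state the kernel size used in LRFE.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Assumed 3x3 base kernel; dilation rates are the ones listed in Table 9.
sizes = {d: effective_kernel_size(3, d) for d in (1, 2, 3)}
print(sizes)  # {1: 3, 2: 5, 3: 7}
```

A larger dilation widens the receptive field without adding parameters, but it also samples the input more sparsely, which is one plausible reading of why the scores stop improving beyond a dilation rate of 2.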

    Table 12  Sensitivity analysis results for λ

    λ    P(%)  R(%)  F1    mAP@0.5(%)
    1.5  91.9  91.1  0.91  95.7
    2    91.9  91.2  0.92  95.9
    2.5  89.8  90.0  0.90  95.0
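ATFL belongs to the focal-loss family, in which a modulating term shrinks the loss of well-classified samples so that hard samples dominate training. The exact ATFL formula is not given on this page, so the sketch below shows the standard focal loss instead, -(1 - p_t)^γ log(p_t), with its exponent γ swept over the same numeric values as λ in Table 12. Treating γ as a stand-in for λ is an assumption made purely for illustration; the two coefficients need not play the same role.

```python
import math


def cross_entropy(p_true):
    """Cross-entropy for one sample, given the probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Standard focal loss for one sample: -(1 - p_t)^gamma * log(p_t)."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)


easy, hard = 0.9, 0.1  # hypothetical confidences for an easy and a hard sample
ce_ratio = cross_entropy(hard) / cross_entropy(easy)
print(f"cross-entropy hard/easy ratio: {ce_ratio:.1f}")
for gamma in (1.5, 2.0, 2.5):
    fl_ratio = focal_loss(hard, gamma) / focal_loss(easy, gamma)
    print(f"gamma={gamma}: focal hard/easy ratio: {fl_ratio:.1f}")
```

The hard-to-easy loss ratio grows with the exponent, which shows the trade-off a sensitivity study like Table 12 probes: too little modulation leaves easy samples dominant, while too much can suppress useful gradient from moderately easy ones.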

    Table 13  Performance on the original and randomly degraded test sets (YOLO-FER/YOLO12n)

    Expression  Angry      Disgust    Fear       Happy      Neutral    Sad        Surprise   Average
    Original    91.1/85.7  67.1/59.2  75.4/66.3  98.6/98.2  92.7/90.8  93.6/92.9  94.6/93.5  87.6/83.8
    Degraded    85.6/80.9  66.4/54.2  70.4/58.8  98.1/97.9  88.7/88.2  91.6/90.2  92.5/90.2  84.8/80.1
    Δ           -5.5/-4.8  -0.7/-5.0  -5.0/-7.5  -0.5/-0.5  -4.0/-2.6  -2.0/-2.7  -2.1/-3.3  -2.8/-3.7
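The Average column of Table 13 can be reproduced from the per-class scores. The sketch below recomputes it as an unweighted mean over the seven classes for both models, then takes the drop from the original to the degraded test set. That the column is an unweighted mean is an assumption, although the recomputed values agree with the table to one decimal place.

```python
# Per-class scores copied from Table 13 as (YOLO-FER, YOLO12n) pairs.
original = {
    "Angry": (91.1, 85.7), "Disgust": (67.1, 59.2), "Fear": (75.4, 66.3),
    "Happy": (98.6, 98.2), "Neutral": (92.7, 90.8), "Sad": (93.6, 92.9),
    "Surprise": (94.6, 93.5),
}
degraded = {
    "Angry": (85.6, 80.9), "Disgust": (66.4, 54.2), "Fear": (70.4, 58.8),
    "Happy": (98.1, 97.9), "Neutral": (88.7, 88.2), "Sad": (91.6, 90.2),
    "Surprise": (92.5, 90.2),
}


def mean_column(table, index):
    """Unweighted mean over classes for one model (0 = YOLO-FER, 1 = YOLO12n)."""
    values = [pair[index] for pair in table.values()]
    return sum(values) / len(values)


summary = {}
for index, model in enumerate(("YOLO-FER", "YOLO12n")):
    before = round(mean_column(original, index), 1)
    after = round(mean_column(degraded, index), 1)
    summary[model] = (before, after, round(after - before, 1))

print(summary)
```

On these numbers YOLO-FER loses 2.8 points on average under degradation against 3.7 for YOLO12n, so the gap between the two models widens slightly on degraded inputs.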
  • [1] ADYAPADY R R and ANNAPPA B. A comprehensive review of facial expression recognition techniques[J]. Multimedia Systems, 2023, 29(1): 73–103. doi: 10.1007/S00530-022-00984-w.
    [2] LI Shan and DENG Weihong. Deep facial expression recognition: A survey[J]. IEEE Transactions on Affective Computing, 2022, 13(3): 1195–1215. doi: 10.1109/taffc.2020.2981446.
    [3] ZHANG Guoxiang and SUN Yunzhuo. CNN facial recognition method based on local binary pattern in complex light environment[J]. Journal of Hubei Normal University: Natural Science, 2023, 43(4): 49–55. doi: 10.3969/j.issn.2096-3149.2023.04.007.
    [4] LI Rui, LIU Pengyu, and JIA Kebin. Facial expression recognition under partial occlusion[J]. Computer Applications and Software, 2016, 33(9): 147–150,175. doi: 10.3969/j.issn.1000-386x.2016.09.035.
    [5] LI Shan and DENG Weihong. Deep facial expression recognition: A survey[J]. Journal of Image and Graphics, 2020, 25(11): 2306–2320. doi: 10.11834/jig.200233.
    [6] WANG Kai, PENG Xiaojiang, YANG Jianfei, et al. Region attention networks for pose and occlusion robust facial expression recognition[J]. IEEE Transactions on Image Processing, 2020, 29: 4057–4069. doi: 10.1109/TIP.2019.2956143.
    [7] MAO Jiawei, XU Rui, YIN Xuesong, et al. POSTER++: A simpler and stronger facial expression recognition network[J]. Pattern Recognition, 2025, 157: 110951. doi: 10.1016/J.PATCOG.2024.110951.
    [8] ZHAO Minghua, DONG Shuangshuang, HU Jing, et al. Attention-guided three-stream convolutional neural network for microexpression recognition[J]. Journal of Image and Graphics, 2024, 29(1): 111–122. doi: 10.11834/jig.230053.
    [9] YANG Qiaohe, HE Yueshun, CHEN Hongmao, et al. A novel lightweight facial expression recognition network based on deep shallow network fusion and attention mechanism[J]. Algorithms, 2025, 18(8): 473. doi: 10.3390/A18080473.
    [10] WEN Zhengyao, LIN Wenzhong, WANG Tao, et al. Distract your attention: Multi-head cross attention network for facial expression recognition[J]. Biomimetics, 2023, 8(2): 199. doi: 10.3390/BIOMIMETICS8020199.
    [11] LAI Zhenyi, CHEN Renhe, JIA Jinlu, et al. Real-time micro-expression recognition based on ResNet and atrous convolutions[J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14(11): 15215–15226. doi: 10.1007/s12652-020-01779-5.
    [12] XUE Peiyun, DAI Shutao, BAI Jing, et al. Emotion recognition with speech and facial images[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4542–4552. doi: 10.11999/JEIT240087.
    [13] ZHANG Jiahao, LIU Feng, and QI Jiayin. Lightweight micro-expression recognition architecture based on Bottleneck Transformer[J]. Computer Science, 2022, 49(6A): 370–377. doi: 10.11896/jsjkx.210500023.
    [14] ZHANG Peng, KONG Weiwei, and TENG Jinbao. Facial expression recognition based on multi-scale feature attention mechanism[J]. Computer Engineering and Applications, 2022, 58(1): 182–189. doi: 10.3778/j.issn.1002-8331.2106-0174.
    [15] SHAO Yanhua, ZHANG Duo, CHU Hongyu, et al. A review of YOLO object detection based on deep learning[J]. Journal of Electronics & Information Technology, 2022, 44(10): 3697–3708. doi: 10.11999/JEIT210790.
    [16] MA Xu, DAI Xiyang, BAI Yue, et al. Rewrite the stars[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 5694–5703. doi: 10.1109/CVPR52733.2024.00544.
    [17] YU Yang, ZHANG Yi, CHENG Zeyu, et al. MCA: Multidimensional collaborative attention in deep convolutional neural networks for image recognition[J]. Engineering Applications of Artificial Intelligence, 2023, 126: 107079. doi: 10.1016/j.engappai.2023.107079.
    [18] YANG Bo, ZHANG Xinyu, ZHANG Jian, et al. EFLNet: Enhancing feature learning network for infrared small target detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5906511. doi: 10.1109/TGRS.2024.3365677.
    [19] LI Shan and DENG Weihong. Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial expression recognition[J]. IEEE Transactions on Image Processing, 2019, 28(1): 356–370. doi: 10.1109/TIP.2018.2868382.
    [20] Emotiscore. Low light dataset computer vision model[EB/OL]. https://universe.roboflow.com/emotiscore/low-light-dataset, 2025.
Figures (14) / Tables (13)
Publication history
  • Revised: 2026-04-08
  • Accepted: 2026-04-08
  • Published online: 2026-04-28
