Lightweight neural network hand gesture recognition method for embedded platforms

Yang Chenyi, He Yuqing, Zhao Junyuan, Li Guorong

Citation: Yang Chenyi, He Yuqing, Zhao Junyuan, et al. Lightweight neural network hand gesture recognition method for embedded platforms[J]. High Power Laser and Particle Beams, 2022, 34: 031023. doi: 10.11884/HPLPB202234.210335

doi: 10.11884/HPLPB202234.210335
Funding: National Key R&D Program of China (2020YFF0304104)
Details
    About the author:

    Yang Chenyi, 3120190590@bit.edu.cn

    Corresponding author:

    He Yuqing, yuqinghe@bit.edu.cn

  • CLC number: TP391

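  • Abstract: Traditional hand gesture recognition algorithms based on image segmentation and feature extraction suffer from low accuracy and poor flexibility against complex backgrounds; algorithms based on object detection neural networks can effectively improve recognition accuracy in such environments. Constrained by the size and power consumption of embedded processors, however, commonly used object detection networks run slowly on embedded platforms and cannot meet the requirements of real-time hand gesture recognition. This work optimizes the SSD object detector: MobileNetv3 is used for feature extraction, and the SSDLite structure, which replaces standard convolutions with depthwise separable convolutions, is used for detection, yielding the lightweight MobileNetv3-SSDLite hand gesture recognition algorithm. A dataset containing different hand gestures was built for the task and used to train the model on a server. To fit the compute limits of the embedded platform, model quantization converts the float64 network parameters to int8 and the network structure is compressed, which raises inference speed on the embedded device and enables embedded hand gesture recognition. Experimental results show that the embedded MobileNetv3-SSDLite algorithm reaches an average accuracy of 99.61% at more than 50 frames per second, meeting the requirements of real-time hand gesture recognition.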
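The abstract states that the network parameters are quantized to int8 but does not give the scheme. As a minimal sketch, assuming symmetric per-tensor quantization (an assumption, not necessarily the authors' method), the idea can be illustrated as follows. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to int8 (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; use 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each stored value shrinks from 8 bytes (float64) to 1 byte, and the rounding error of any single weight is bounded by half a quantization step (`scale / 2`).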
  • Figure 1. Construction and pipeline of the algorithm

    Figure 2. Depthwise separable convolution

    Figure 3. Squeeze-and-excitation module [22]

    Figure 4. Hand gesture recognition network based on MobileNetv3-SSDLite

    Figure 5. Neural network structure before and after the embedded optimization

    Figure 6. The chosen hand gestures

    Figure 7. Sample images from the hand gesture training set

    Figure 8. Network loss during training

    Figure 9. NVIDIA Jetson TX2 embedded processor developer kit

    Figure 10. Selected hand gesture recognition results
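The saving from the depthwise separable convolution named in the Figure 2 caption can be sketched with a parameter and multiply-accumulate count. This is an illustrative calculation only: it assumes stride 1, no bias terms, and a hypothetical 3×3 layer with 512 input and output channels on a 19×19 feature map, none of which are taken from the paper.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Parameters and MACs of a standard k x k convolution (no bias)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # one full kernel application per output position
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Parameters and MACs of a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


# Hypothetical layer size, chosen only for illustration.
std_params, std_macs = standard_conv_cost(3, 512, 512, 19, 19)
sep_params, sep_macs = depthwise_separable_cost(3, 512, 512, 19, 19)
ratio = sep_params / std_params
print(std_params, sep_params, round(ratio, 4))
```

Under these assumptions the ratio equals 1/c_out + 1/k², so a 3×3 layer needs roughly one ninth of the parameters and MACs of its standard counterpart.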

    Table 1. Comparison of the MobileNet series with VGG16

    network structure   params/Mbyte   MACs/10^6   ImageNet accuracy/%
    VGG16               13.8           15300       71.5
    MobileNetv1         4.2            569         70.6
    MobileNetv2         3.4            300         72.0
    MobileNetv3         5.4            219         75.2

    Table 2. Extra feature map layers for object detection

    extra layers   shape
    layer 1        $39 \times 39 \times 512$
    layer 2        $19 \times 19 \times 1024$
    layer 3        $10 \times 10 \times 512$
    layer 4        $5 \times 5 \times 256$
    layer 5        $3 \times 3 \times 256$
    layer 6        $1 \times 1 \times 256$

    Table 3. Comparison of the SSDLite detection head with SSD

    network structure   params/Mbyte   MACs/10^6   mAP/%
    SSD                 14.8           1250        19.3
    SSDLite             2.1            350         22.2

    Table 4. Recognition results of hand gestures

    hand gesture   accuracy/%
    0              99.64
    1              100.00
    3              99.51
    4              99.22
    5              99.69
    average        99.61
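The average in Table 4 is consistent with an unweighted mean of the five per-gesture accuracies. The short check below assumes equal weighting per gesture class, which the table itself does not state.

```python
# Per-gesture accuracies (%) copied from Table 4.
accuracies = {"0": 99.64, "1": 100.00, "3": 99.51, "4": 99.22, "5": 99.69}

# Unweighted mean over gesture classes (assumed weighting).
mean_accuracy = sum(accuracies.values()) / len(accuracies)
print(f"{mean_accuracy:.2f}")
```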

    Table 5. Recognition results of hand gestures in various scenarios

    scenarios                average accuracy/%
    multiple hand gestures   96
    complicated background   64
    low light intensity      72

    Table 6. Comparison of different hand gesture recognition algorithms

    algorithm             params/Mbyte   MACs/10^6   frame rate/(frame/s)   mean accuracy/%
    VGG16-SSD             24.3           30654       2                      91.75
    MobileNetv1-SSD       7.2            1299        12                     93.98
    MobileNetv1-SSDLite   4.1            1130        16                     93.86
    MobileNetv2-SSDLite   3.1            656         36                     91.01
    MobileNetv3-SSDLite   2.2            526         58                     99.61
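To put the Table 6 figures in relative terms, the sketch below computes how much smaller MobileNetv3-SSDLite is than VGG16-SSD and what per-frame time its frame rate implies. The values are copied from Table 6, with the VGG16-SSD row read as 30654 ×10^6 MACs at 2 frame/s. The 50 frame/s threshold comes from the abstract, and the variable names are illustrative.

```python
# (params/Mbyte, MACs/10^6, frame rate in frame/s, mean accuracy/%) from Table 6.
rows = {
    "VGG16-SSD": (24.3, 30654, 2, 91.75),
    "MobileNetv1-SSD": (7.2, 1299, 12, 93.98),
    "MobileNetv1-SSDLite": (4.1, 1130, 16, 93.86),
    "MobileNetv2-SSDLite": (3.1, 656, 36, 91.01),
    "MobileNetv3-SSDLite": (2.2, 526, 58, 99.61),
}

base = rows["VGG16-SSD"]
best = rows["MobileNetv3-SSDLite"]

param_reduction = base[0] / best[0]   # how many times fewer parameters
macs_reduction = base[1] / best[1]    # how many times fewer MACs
frame_time_ms = 1000 / best[2]        # per-frame time implied by the frame rate
budget_ms = 1000 / 50                 # per-frame budget at 50 frame/s

print(f"{param_reduction:.1f}x params, {macs_reduction:.1f}x MACs, "
      f"{frame_time_ms:.1f} ms per frame (budget {budget_ms:.0f} ms)")
```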
  • [1] Chen Zhuanglian, Lin Xiaole, Wang Jiawei, et al. Design of human-computer interaction system for gesture recognition based on convolutional neural network[J]. Modern Computer, 2021(6): 57-62. doi: 10.3969/j.issn.1007-1423.2021.06.011
    [2] Yuan Bo, Zha Chendong. Gesture recognition technology development status and outlook[J]. Scientific and Technological Innovation, 2018(32): 95-96. doi: 10.3969/j.issn.1673-1328.2018.32.056
    [3] Shi Mengli, Zhang Beiwei, Liu Guanghui. Real-time gesture recognition method based on depth image[J]. Computer Engineering and Design, 2020, 41(7): 2057-2062.
    [4] Peng Liren, Wang Jin, Lin Xujun, et al. A static gesture recognition method based on depth image and neural network[J]. Automation & Instrumentation, 2020(1): 6-9,15.
    [5] Wu Yifan, Guo Jianhui. Implementation of an improved gesture segmentation algorithm based on skin color model[J]. Electronic Design Engineering, 2020, 28(18): 185-188,193.
    [6] Li Hui, Yang Lei, Wu Xiaoyu, et al. Static hand gesture recognition based on HOG with Kinect[C]//Proceedings of the 2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics. 2012: 271-273.
    [7] Liua C, Zhou Shuwang, Hu Sheng, et al. Hand gesture recognition based on sEMG signal and improved SVM voting method[C]//Proceedings of the 2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE). 2020: 605-608.
    [8] Shi Yuxin, Deng Hongmin, Guo Weilin. Static gesture recognition based on hybrid convolution neural network[J]. Computer Science, 2019, 46(s1): 165-168.
    [9] Hussain S, Saxena R, Han Xie, et al. Hand gesture recognition using deep learning[C]//Proceedings of the 2017 International SoC Design Conference (ISOCC). 2017: 48-49.
    [10] Guo Ziyan, Han Huiyan, He Ligang, et al. Gesture recognition algorithm and application based on improved YOLOV4[J]. Journal of North University of China (Natural Science Edition), 2021, 42(3): 223-231.
    [11] Chhajed R R, Parmar K P, Pandya M D, et al. Messaging and video calling application for specially abled people using hand gesture recognition[C]//Proceedings of the 2021 6th International Conference for Convergence in Technology (I2CT). 2021: 1-4.
    [12] Yi Chengming, Zhou Liguang, Wang Zhixiang, et al. Long-range hand gesture recognition with joint SSD network[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO). 2018: 1959-1963.
    [13] Kong Weigang, Li Wenjing, Wang Qiuyan, et al. Design and implementation of lightweight network based on YOLOv4 algorithm[J/OL]. Computer Engineering, 1-10(2021-04-30). https://doi.org/10.19678/j.issn.1000-3428.0060948
    [14] Liu Wei, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector[C]//Proceedings of the 14th European Conference on Computer Vision. 2016: 21-37.
    [15] Ren Shaoqing, He Kaiming, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
    [16] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016: 779-788.
    [17] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[DB/OL]. arXiv preprint arXiv: 1409.1556, 2014.
    [18] Howard A, Sandler M, Chen Bo, et al. Searching for MobileNetV3[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019: 1314-1324.
    [19] Howard A G, Zhu Menglong, Chen Bo, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[DB/OL]. arXiv preprint arXiv: 1704.04861, 2017.
    [20] Sandler M, Howard A, Zhu Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
    [21] Hu Jie, Shen Li, Albanie S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023. doi: 10.1109/TPAMI.2019.2913372
    [22] Yang Guowei, Xu Zhiwang, Fang Chen, et al. Object detection network compression method based on pruning and quantization[J/OL]. Computer Engineering and Applications, 1-12[2021-12-17]. http://kns.cnki.net/kcms/detail/11.2127.tp.20210918.1121.008.html
Figures (10) / Tables (6)
Publication history
  • Received:  2021-07-30
  • Revised:  2021-12-21
  • Published online:  2022-01-05
  • Issue date:  2022-01-13
