Lightweight neural network hand gesture recognition method for embedded platforms

Yang Chenyi; He Yuqing; Zhao Junyuan; Li Guorong

doi: 10.11884/HPLPB202234.210335

Key Laboratory of Photoelectronic Imaging Technology and System of Ministry of Education, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China

Funding: National Key R&D Program of China (2020YFF0304104)

Author:
Yang Chenyi, 3120190590@bit.edu.cn

Corresponding author:
He Yuqing, yuqinghe@bit.edu.cn

CLC number: TP391
Publication history
- Received: 2021-07-30
- Revised: 2021-12-21
- Published online: 2022-01-05
- Issue date: 2022-01-13

Abstract

Traditional gesture recognition algorithms based on image segmentation and feature extraction have low recognition accuracy and poor flexibility in complex backgrounds, whereas gesture recognition algorithms based on object detection neural networks can effectively improve recognition accuracy in complex environments. However, restricted by the size and power consumption of embedded processors, commonly used object detection neural networks run slowly on embedded platforms and cannot meet the requirements of real-time gesture recognition. In this paper, we optimize the SSD object detector: the MobileNetv3 network performs feature extraction, and the SSDLite structure, which replaces ordinary convolution with depthwise separable convolution, performs object detection. Together they form the lightweight MobileNetv3-SSDLite gesture recognition algorithm. To meet the requirements of gesture recognition, we build a dataset containing different gestures and use it to train the model on a server. To fit the limited computing power of the embedded processor, we quantize the float64 network parameters to int8 through model quantization and compression, and compress the network structure, which improves the inference speed of the network on the embedded processor and enables embedded gesture recognition. Experimental results show that the embedded MobileNetv3-SSDLite gesture recognition algorithm achieves an average accuracy of 99.61% at a recognition speed of more than 50 frame/s, which meets the requirements of real-time gesture recognition.

Key words:
- hand gesture recognition
- deep neural network
- embedded system
- lightweight
- MobileNetv3-SSDLite
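The abstract states that float64 parameters are quantized to int8 but does not say which scheme is used. As a rough illustration only, the sketch below assumes symmetric per-tensor quantization with a single scale factor; the function names and the sample weights are hypothetical and are not taken from the paper.

```python
def quantize_int8(weights):
    """Map floats to int8 codes with one shared scale (assumed symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    # The largest magnitude maps to +/-127; guard against an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to exercise the functions.
weights = [0.82, -1.27, 0.031, 0.5, -0.004]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", codes)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Under this assumed scheme each parameter shrinks from 8 bytes to 1 byte, and the round-trip error of any weight is bounded by half the scale factor.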

Figure 1. Block diagram of the algorithm construction and workflow

Figure 2. Depthwise separable convolution
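To make the saving from depthwise separable convolution concrete, the following sketch compares parameter and multiply-accumulate (MAC) counts for an ordinary convolution and its depthwise separable counterpart. It uses the standard cost formulas with biases ignored; the layer dimensions in the example are hypothetical and do not come from the paper.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Parameters and MACs of an ordinary k x k convolution (bias ignored)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # one full kernel application per output position
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Parameters and MACs of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 conv that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h * w
    return params, macs


# Hypothetical layer: 3 x 3 kernel, 256 -> 256 channels, 19 x 19 output map.
k, c_in, c_out, h, w = 3, 256, 256, 19, 19
std_params, std_macs = standard_conv_cost(k, c_in, c_out, h, w)
dws_params, dws_macs = depthwise_separable_cost(k, c_in, c_out, h, w)
ratio = dws_params / std_params

print(f"standard:  {std_params} params, {std_macs} MACs")
print(f"separable: {dws_params} params, {dws_macs} MACs")
print(f"cost ratio: {ratio:.4f}")
```

For any such layer the cost ratio equals 1/c_out + 1/k², so a 3 × 3 layer with many output channels needs roughly one ninth of the computation.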

Figure 3. Squeeze-and-excitation module[22]
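The squeeze-and-excitation idea can be sketched in a few lines of plain Python: each channel is squeezed to a scalar by global average pooling, a small two-layer network turns those scalars into per-channel gates, and the gates rescale the channels. This is an illustrative sketch, not the paper's implementation. The weights and the tiny feature map are hypothetical, biases are omitted, and the sigmoid gate follows the original squeeze-and-excitation formulation (MobileNetV3 uses a hard-sigmoid variant).

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """feature_map: list of C channels, each a list of rows of floats.
    w1: reduced x C weight matrix, w2: C x reduced weight matrix (no biases)."""
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates


# Hypothetical 2-channel, 2 x 2 feature map and weights.
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[0.5, 0.5]]       # reduces 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]    # expands 1 hidden unit back to 2 channels
scaled, gates = squeeze_excite(feature_map, w1, w2)
print("gates:", gates)
```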

Figure 4. Hand gesture recognition network based on MobileNetv3-SSDLite

Figure 5. Neural network structure before and after the embedded optimization

Figure 6. The selected hand gestures

Figure 7. Sample images from the hand gesture training set

Figure 8. Change of the loss function during training

Figure 9. NVIDIA Jetson TX2 embedded processor developer kit

Figure 10. Selected hand gesture recognition results

Table 1. Comparison of the MobileNet series with VGG16

network structure	params/Mbyte	MACs/10⁶	ImageNet accuracy/%
VGG16	13.8	15300	71.5
MobileNetv1	4.2	569	70.6
MobileNetv2	3.4	300	72.0
MobileNetv3	5.4	219	75.2

Table 2. Extra feature maps used for detection and their shapes

extra layers	shape
layer 1	$39 \times 39 \times 512$
layer 2	$19 \times 19 \times 1024$
layer 3	$10 \times 10 \times 512$
layer 4	$5 \times 5 \times 256$
layer 5	$3 \times 3 \times 256$
layer 6	$1 \times 1 \times 256$
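An SSD-style detector makes predictions at every spatial cell of each feature map, so the shapes in Table 2 determine how many locations are examined per image. The sketch below counts those cells. The number of default boxes per cell is not given here, so `BOXES_PER_CELL` is a hypothetical value used only to show how the total would be derived.

```python
# Extra feature maps from Table 2 as (height, width, channels).
feature_maps = [
    (39, 39, 512),
    (19, 19, 1024),
    (10, 10, 512),
    (5, 5, 256),
    (3, 3, 256),
    (1, 1, 256),
]

BOXES_PER_CELL = 6  # hypothetical; the source does not state this value

# Each spatial cell of each map is one prediction location.
cells_per_layer = [h * w for h, w, _ in feature_maps]
total_cells = sum(cells_per_layer)
total_boxes = total_cells * BOXES_PER_CELL

for index, cells in enumerate(cells_per_layer, start=1):
    print(f"layer {index}: {cells} cells")
print("total cells:", total_cells)
print("total boxes under the assumed boxes-per-cell value:", total_boxes)
```

The first and largest map contributes most of the locations, which is why shallow, high-resolution maps are the ones that help with small targets.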

Table 3. Comparison of the SSDLite detection head with SSD

network structure	params/Mbyte	MACs/10⁶	mAP/%
SSD	14.8	1250	19.3
SSDLite	2.1	350	22.2
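The relative saving of SSDLite over SSD follows directly from the figures in Table 3. The short calculation below uses only the tabulated values; the dictionary keys are illustrative names.

```python
# Values copied from Table 3 (params in Mbyte, MACs in 10^6, mAP in %).
ssd = {"params": 14.8, "macs": 1250, "map": 19.3}
ssdlite = {"params": 2.1, "macs": 350, "map": 22.2}

param_reduction = ssd["params"] / ssdlite["params"]  # how many times smaller
mac_reduction = ssd["macs"] / ssdlite["macs"]        # how many times cheaper
map_gain = ssdlite["map"] - ssd["map"]               # percentage points

print(f"parameters reduced {param_reduction:.2f}x")
print(f"MACs reduced {mac_reduction:.2f}x")
print(f"mAP change: {map_gain:+.1f} percentage points")
```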

Table 4. Recognition results of hand gestures

hand gesture	accuracy/%
0	99.64
1	100.00
3	99.51
4	99.22
5	99.69
average	99.61
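The reported average can be checked against the per-gesture values in Table 4. The snippet below recomputes the unweighted mean over the five listed gesture classes, assuming each class carries equal weight.

```python
# Per-gesture accuracy (%) copied from Table 4.
accuracy_by_gesture = {
    "0": 99.64,
    "1": 100.00,
    "3": 99.51,
    "4": 99.22,
    "5": 99.69,
}

# Unweighted mean over the listed classes.
mean_accuracy = sum(accuracy_by_gesture.values()) / len(accuracy_by_gesture)
# Class with the lowest accuracy.
hardest = min(accuracy_by_gesture, key=accuracy_by_gesture.get)

print(f"mean accuracy: {mean_accuracy:.2f}%")
print(f"lowest-accuracy gesture: {hardest}")
```

The unweighted mean rounds to the 99.61% shown in the table's last row.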

Table 5. Recognition results of hand gestures in various scenarios

scenarios	average accuracy/%
multiple hand gestures	96
complicated background	64
low light intensity	72

Table 6. Comparison of different hand gesture recognition algorithms

algorithm	params/Mbyte	MACs/10⁶	frame rate/(frame/s)	mean accuracy/%
VGG16-SSD	24.3	30654	2	91.75
MobileNetv1-SSD	7.2	1299	12	93.98
MobileNetv1-SSDLite	4.1	1130	16	93.86
MobileNetv2-SSDLite	3.1	656	36	91.01
MobileNetv3-SSDLite	2.2	526	58	99.61
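Frame rates are easier to compare as per-frame latency and as speedup over the slowest baseline. The sketch below derives both from the Table 6 values. The variable names are illustrative, and VGG16-SSD is taken as the baseline simply because it is the first row.

```python
# (algorithm, frame rate in frame/s, mean accuracy in %) copied from Table 6.
results = [
    ("VGG16-SSD", 2, 91.75),
    ("MobileNetv1-SSD", 12, 93.98),
    ("MobileNetv1-SSDLite", 16, 93.86),
    ("MobileNetv2-SSDLite", 36, 91.01),
    ("MobileNetv3-SSDLite", 58, 99.61),
]

baseline_fps = results[0][1]  # VGG16-SSD used as the reference

# Per-frame latency in milliseconds and speedup relative to the baseline.
summary = {
    name: {"latency_ms": 1000.0 / fps, "speedup": fps / baseline_fps}
    for name, fps, _ in results
}
fastest = max(results, key=lambda r: r[1])[0]
most_accurate = max(results, key=lambda r: r[2])[0]

for name, stats in summary.items():
    print(f"{name:22s} {stats['latency_ms']:7.1f} ms/frame  {stats['speedup']:5.1f}x")
print("fastest:", fastest)
print("most accurate:", most_accurate)
```

In this table the same network is both the fastest and the most accurate, so no speed-accuracy trade-off has to be made among the listed candidates.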

References (22)

[1]	Chen Zhuanglian, Lin Xiaole, Wang Jiawei, et al. Design of human-computer interaction system for gesture recognition based on convolutional neural network[J]. Modern Computer, 2021(6): 57-62. (in Chinese) doi: 10.3969/j.issn.1007-1423.2021.06.011
[2]	Yuan Bo, Zha Chendong. Gesture recognition technology development status and outlook[J]. Scientific and Technological Innovation, 2018(32): 95-96. (in Chinese) doi: 10.3969/j.issn.1673-1328.2018.32.056
[3]	Shi Mengli, Zhang Beiwei, Liu Guanghui. Real-time gesture recognition method based on depth image[J]. Computer Engineering and Design, 2020, 41(7): 2057-2062. (in Chinese)
[4]	Peng Liren, Wang Jin, Lin Xujun, et al. A static gesture recognition method based on depth image and neural network[J]. Automation & Instrumentation, 2020(1): 6-9,15. (in Chinese)
[5]	Wu Yifan, Guo Jianhui. Implementation of an improved gesture segmentation algorithm based on skin color model[J]. Electronic Design Engineering, 2020, 28(18): 185-188,193. (in Chinese)
[6]	Li Hui, Yang Lei, Wu Xiaoyu, et al. Static hand gesture recognition based on HOG with Kinect[C]//Proceedings of the 2012 4th International Conference on Intelligent Human-Machine Systems and Cybernetics. 2012: 271-273.
[7]	Liua C, Zhou Shuwang, Hu Sheng, et al. Hand gesture recognition based on sEMG signal and improved SVM voting method[C]//Proceedings of the 2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE). 2020: 605-608.
[8]	Shi Yuxin, Deng Hongmin, Guo Weilin. Static gesture recognition based on hybrid convolution neural network[J]. Computer Science, 2019, 46(s1): 165-168. (in Chinese)
[9]	Hussain S, Saxena R, Han Xie, et al. Hand gesture recognition using deep learning[C]//Proceedings of the 2017 International SoC Design Conference (ISOCC). 2017: 48-49.
[10]	Guo Ziyan, Han Huiyan, He Ligang, et al. Gesture recognition algorithm and application based on improved YOLOV4[J]. Journal of North University of China (Natural Science Edition), 2021, 42(3): 223-231. (in Chinese)
[11]	Chhajed R R, Parmar K P, Pandya M D, et al. Messaging and video calling application for specially abled people using hand gesture recognition[C]//Proceedings of the 2021 6th International Conference for Convergence in Technology (I2CT). 2021: 1-4.
[12]	Yi Chengming, Zhou Liguang, Wang Zhixiang, et al. Long-range hand gesture recognition with joint SSD network[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO). 2018: 1959-1963.
[13]	Kong Weigang, Li Wenjing, Wang Qiuyan, et al. Design and implementation of lightweight network based on YOLOv4 algorithm[J/OL]. Computer Engineering, 1-10(2021-04-30). (in Chinese) https://doi.org/10.19678/j.issn.1000-3428.0060948
[14]	Liu Wei, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector[C]//Proceedings of the 14th European Conference on Computer Vision. 2016: 21-37.
[15]	Ren Shaoqing, He Kaiming, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[16]	Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016: 779-788.
[17]	Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[DB/OL]. arXiv preprint arXiv: 1409.1556, 2014.
[18]	Howard A, Sandler M, Chen Bo, et al. Searching for MobileNetV3[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019: 1314-1324.
[19]	Howard A G, Zhu Menglong, Chen Bo, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[DB/OL]. arXiv preprint arXiv: 1704.04861, 2017.
[20]	Sandler M, Howard A, Zhu Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
[21]	Hu Jie, Shen Li, Albanie S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023. doi: 10.1109/TPAMI.2019.2913372
[22]	Yang Guowei, Xu Zhiwang, Fang Chen, et al. Object detection network compression method based on pruning and quantization[J/OL]. Computer Engineering and Applications, 1-12[2021-12-17]. (in Chinese) http://kns.cnki.net/kcms/detail/11.2127.tp.20210918.1121.008.html