
Design and Implementation of a CNN Accelerator Based on Zynq

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 31
Issue:: 2021, No. 11

Pages:: 108-113

Section:: Systems Engineering

Publication date:: 2021-11-10

Article Info

Title:: Design and Implementation of CNN Accelerator Based on Zynq

Article ID:: 1673-629X(2021)11-0108-06

Author(s):: XU Jie (许杰); ZHANG Zi-heng (张子恒); WANG Xin-yu (王新宇); TONG Cheng (佟诚); MEI Qing (梅青); XIAO Jian* (肖建); School of Electronic and Optical Engineering, School of Microelectronics, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China

Keywords:: Zynq; convolutional neural network; hardware acceleration; field-programmable gate array (FPGA); data quantization; CIFAR-10

CLC number:: TP39

DOI:: 10.3969/j.issn.1673-629X.2021.11.018

Abstract:: A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to neighboring units within a local region of coverage, and it performs well on large-scale image processing. This paper designs a CNN accelerator based on the Zynq chip, aiming to accelerate computation on an FPGA with limited resources and power budget. The accelerator uses data quantization to convert network parameters from 64-bit double-precision floating-point numbers to 16-bit fixed-point numbers. Different structures and optimization strategies are designed according to the characteristics and requirements of each CNN layer: the convolutional and fully connected layers are optimized with loop tiling, loop pipelining, and loop unrolling, while the pooling layer is pipelined. A caching strategy between the FPGA and external memory is also designed to reduce the volume of data transferred between them. Taking image recognition on the CIFAR-10 data set as an example, a board-level test was performed on a Zynq7020 experimental platform. The results show that, at a working frequency of 100 MHz, the average recognition time is 15.5 ms, a 144-fold speedup over a single-core CPU implementation.
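The data quantization step named in the abstract can be illustrated with a minimal sketch. The abstract gives only the word widths (64-bit double to 16-bit fixed point); the split between integer and fractional bits is not stated, so the 8 fractional bits used here, along with the helper names `quantize`, `dequantize`, `fixed_mul`, and `fixed_dot`, are assumptions made purely for illustration and are not the authors' implementation.

```python
# Sketch only: FRAC_BITS = 8 is an assumed Q-format, not a value from the paper.
FRAC_BITS = 8                      # hypothetical number of fractional bits
SCALE = 1 << FRAC_BITS             # 2**8 = 256 codes per unit
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(code: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, code))


def quantize(x: float) -> int:
    """Convert a float to a 16-bit fixed-point code (round to nearest, ties to even)."""
    return saturate(round(x * SCALE))


def dequantize(code: int) -> float:
    """Convert a 16-bit fixed-point code back to a float."""
    return code / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two codes; the wide product carries 2*FRAC_BITS fractional bits."""
    product = a * b                       # wide intermediate result
    return saturate(product >> FRAC_BITS) # drop extra fractional bits (floor)


def fixed_dot(weights, activations) -> int:
    """Multiply-accumulate in a wide integer, then rescale once at the end."""
    acc = sum(w * a for w, a in zip(weights, activations))
    return saturate(acc >> FRAC_BITS)
```

With this assumed format the representable range is roughly -128 to 127.996 with a step of 1/256, so a weight such as 0.1 is stored as code 26 (0.1015625). Accumulating in a wide integer and rescaling once, as `fixed_dot` does, avoids compounding a rounding error on every product.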
