[1] CHAN Shi-bing, LIU Ning-zhong, SHEN Jia-quan. A Lightweight Model for Irregular Scene Text Recognition[J]. Computer Technology and Development, 2020, 30(11): 20-24. [doi:10.3969/j.issn.1673-629X.2020.11.004]

A Lightweight Model for Irregular Scene Text Recognition

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
30
Issue:
No. 11, 2020
Pages:
20-24
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2020-11-10

Article Info

Title:
A Lightweight Model for Irregular Scene Text Recognition
Article ID:
1673-629X(2020)11-0020-05
Author(s):
CHAN Shi-bing, LIU Ning-zhong, SHEN Jia-quan
School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Keywords:
scene text recognition; convolutional neural network; lightweight network; recurrent neural network; spatial transformation network
CLC number:
TP391.4
DOI:
10.3969/j.issn.1673-629X.2020.11.004
Abstract:
Scene text recognition has been a challenging task in recent years. Unlike text in regular document images, text in scene images varies widely in shape and is often curved, which makes it difficult to recognize. This paper proposes a lightweight model for irregular scene text recognition (ISTR-LW). Whereas existing scene text recognition models have a large number of parameters, ISTR-LW uses a modified version of the lightweight network PeleeNet for feature sequence extraction, which both greatly reduces the number of parameters and speeds up network prediction. A Dense Block module is introduced in the recurrent layers that produce the label distribution, which speeds up the convergence of training. An attention mechanism is introduced when producing the final recognition result; it identifies the key regions to focus on and improves recognition accuracy. A thin-plate spline transformation rectifies irregular text, which mitigates the low recognition rate on irregular text. ISTR-LW is an end-to-end text recognition model. Experiments on public datasets including Synth90K, Street View Text, and ICDAR show good results.
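
The thin-plate spline transformation named in the abstract can be illustrated with a minimal NumPy sketch. It is not the ISTR-LW implementation: the abstract does not say how control points are predicted or how the image is resampled, so the function names (`fit_tps`, `apply_tps`), the kernel choice U(r) = r^2 log(r^2), and the example control points are all assumptions made for illustration. The sketch only fits a 2D spline that maps source control points onto target control points and evaluates it at arbitrary coordinates.

```python
import numpy as np


def _tps_kernel(r2):
    """Radial basis U(r) = r^2 * log(r^2), with U(0) defined as 0 (assumed kernel form)."""
    out = np.zeros_like(r2, dtype=float)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out


def fit_tps(src, dst):
    """Solve for spline parameters mapping src control points (n, 2) onto dst (n, 2).

    Returns an (n + 3, 2) array: n kernel weights followed by 3 affine terms.
    Assumes the src points are distinct and not all collinear.
    """
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = _tps_kernel(d2)
    P = np.hstack([np.ones((n, 1)), src])  # affine basis [1, x, y]

    # Block system [[K, P], [P^T, 0]] @ params = [dst, 0]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)


def apply_tps(params, src, pts):
    """Evaluate the fitted spline at query points pts (m, 2)."""
    n = src.shape[0]
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = _tps_kernel(d2)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return U @ params[:n] + P @ params[n:]


# Hypothetical control points: four corners and the centre of a unit square,
# with targets displaced vertically to mimic a curved text baseline.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
dst = src + np.array([[0.0, 0.1], [0.0, -0.1], [0.0, 0.1], [0.0, -0.1], [0.0, 0.2]])
params = fit_tps(src, dst)
warped = apply_tps(params, src, src)  # the control points land on their targets
```

In a rectification setting, a mapping like this would typically be evaluated on a regular grid of output pixel coordinates and used to sample the input image. That sampling step, and any network that predicts the control points, is outside this sketch.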

Last Update: 2020-11-10