

Improved Text Detection for Natural Streetscape


Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 33
Issue:: 2023, No. 04

Pages:: 82-88

Column:: Media Computing

Publication date:: 2023-04-10

Article Info

Title:: Improved Text Detection for Natural Streetscape

Article ID:: 1673-629X(2023)04-0082-07

Author(s):: DING Ze; CHENG Yan-yun; School of Automation, School of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China

Keywords:: text detection; feature pyramid; Polarized Self-Attention (PSA); RFB module; strip pooling module

CLC number:: TP391.4

DOI:: 10.3969/j.issn.1673-629X.2023.04.012

Abstract:: In recent years, with the development of deep learning, text detection in natural street scenes has made great progress, but results on multi-oriented text, curved text, and low-contrast text remain unsatisfactory. To address the detection of curved and low-contrast text, we propose a text detection method that incorporates multi-scale modules, and we use the improved detection to raise the accuracy of end-to-end text recognition. First, to address the loss of local information in the RFB (Receptive Field Block) module after downsampling, a Polarized Self-Attention (PSA) mechanism is embedded in the RFB module so that it extracts effective text features and yields a better feature map representation. Second, to address the insufficient features and small receptive field of the feature pyramid network (FPN), the improved RFB module is embedded in the FPN to strengthen feature extraction and fusion. Third, to address uncertain feature distributions and poor fusion of long-range features, a Strip Pooling module is introduced, which improves the robustness of the detection method. Experimental results on the public Total-Text dataset show that, for end-to-end text recognition, the F-measure of the proposed algorithm is 0.3 percentage points higher than that of the efficient MaskTextSpotterV3 without a lexicon and 0.2 percentage points higher with a lexicon. The method also performs reasonably well on detection alone.
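The reported gains are differences in F-measure, so it may help to recall how that score is usually computed. The sketch below assumes the common definition (harmonic mean of precision and recall over matched text instances). The function name `f_measure` and all counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the paper or from the Total-Text evaluation protocol.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, returned as a fraction in [0, 1]."""
    detected = true_positives + false_positives    # everything the detector output
    annotated = true_positives + false_negatives   # everything in the ground truth
    if true_positives == 0 or detected == 0 or annotated == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / annotated
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the paper.
baseline = f_measure(true_positives=70, false_positives=30, false_negatives=30)
improved = f_measure(true_positives=80, false_positives=20, false_negatives=20)

# A gain in percentage points is the plain difference of the two scores
# after scaling each to the 0-100 range (not a relative percentage change).
gain_in_points = 100 * (improved - baseline)
```

Under this reading, a gain of 0.3 percentage points means the two F-measure values, expressed on a 0-100 scale, differ by 0.3.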

