Scene Text Detection Algorithm Combining Local and Global Features (融合局部特征与全局特征的场景文本检测算法)

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, S2

Pages:: 25-30

Section:: Artificial Intelligence

Publication date:: 2022-12-11

Article Info

Title:: Scene Text Detection Algorithm Combining Local and Global Features

Article ID:: 1673-629X(2022)S2-0025-06

Author(s):: ZHAO Xiao-qin (赵晓芹); School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China

Keywords:: text detection; weakly supervised learning; feature fusion; ResNet; FPN

Classification number:: TP181

DOI:: 10.3969/j.issn.1673-629X.2022.S2.004

Abstract:: Detecting text in complex scenes is a challenging task. Existing text detection methods take either characters or words as the detection target. For text whose characters are loosely arranged within a word, or text with little spacing between characters, character-based detection algorithms tend to split one word into several words or to merge several words into one. In such cases word-based methods are somewhat more accurate, but character-based methods locate each individual character more accurately than word-based methods do. In view of these respective strengths and weaknesses, a network combining ResNet and FPN is used to integrate the two approaches and make full use of both the low-level and the high-level features of text. The network detects words and, at the same time, the characters within each word, then refines and fuses the two kinds of information to achieve better detection results. To reduce the cost of annotating character-level datasets, weakly supervised learning is added in the experiments, so that the network can still detect characters well when trained on datasets that have only word-level annotations. Finally, the effectiveness of the method is verified on the ICDAR 2013, ICDAR 2015, and Total-Text datasets.
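The abstract mentions combining shallow and deep features through an FPN-style network. As a rough illustration only, the following sketch shows the generic top-down fusion idea behind feature pyramids: a coarse, deep map is upsampled and added to a finer, shallow map. It is not the paper's network. The function names (`upsample2x`, `add_maps`, `top_down_fuse`) and the toy values are assumptions made for this example, and the 1x1 lateral and 3x3 smoothing convolutions of a real FPN are deliberately omitted.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def add_maps(a, b):
    """Element-wise sum of two 2-D maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(pyramid):
    """Fuse a pyramid ordered from shallow (high resolution) to deep (low resolution).

    Each level is assumed to be exactly half the size of the previous one.
    Returns the fused maps in the same shallow-to-deep order.
    """
    fused = [pyramid[-1]]                          # the deepest level is passed through
    for lateral in reversed(pyramid[:-1]):
        fused.append(add_maps(lateral, upsample2x(fused[-1])))
    fused.reverse()
    return fused


# Toy data: a 4x4 "shallow" map (fine local detail) and a 2x2 "deep" map (coarse context).
shallow = [[1, 0, 0, 1],
           [0, 1, 1, 0],
           [0, 1, 1, 0],
           [1, 0, 0, 1]]
deep = [[10, 20],
        [30, 40]]

fused = top_down_fuse([shallow, deep])
for row in fused[0]:
    print(row)
```

In the fused high-resolution map, every cell carries both its local value and the coarse context of the region it belongs to, which is the sense in which such a network can use low-level and high-level features together.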

