[1] FANG Cheng-zhi, NI Meng-yuan, TANG Liang. Scene Text Detection Based on Residual Network and Stroke Width Transform[J]. Computer Technology and Development, 2023, 33(01): 49-55. [doi:10.3969/j.issn.1673-629X.2023.01.008]

Scene Text Detection Based on Residual Network and Stroke Width Transform

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
No. 01, 2023
Pages:
49-55
Section:
Media Computing
Publication date:
2023-01-10

Article Info

Title:
Scene Text Detection Based on Residual Network and Stroke Width Transform
Article ID:
1673-629X(2023)01-0049-07
Author(s):
FANG Cheng-zhi (方承志), NI Meng-yuan (倪梦媛), TANG Liang (唐亮)
School of Electronic and Optical Engineering, School of Microelectronics, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
Keywords:
text detection; long text; residual structure; loss function; stroke width transform
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2023.01.008
Abstract:
To address the poor detection of long text in natural scenes, a natural scene text detection algorithm based on a residual network and the stroke width transform (SWT) is proposed. The algorithm improves on EAST in three ways. First, a residual structure is introduced to deepen the network, which enlarges the receptive field, avoids the vanishing-gradient problem, and improves the network's learning ability. Second, the distance between the center points of the predicted box and the ground-truth text box is added to the loss function as a penalty term, which distinguishes detection boxes that overlap the ground truth in different ways and further improves detection accuracy. Third, an SWT stage is added after the non-maximum suppression (NMS) stage: it expands each predicted text box and applies rules to decide whether characters were missed, thereby completing the missing long-text information. In tests on the ICDAR2015 and MSRA-TD500 datasets, the F-measure of the EAST algorithm is improved by 3.7 and 4.9 percentage points, respectively. The results show that the proposed algorithm effectively improves detection accuracy and improves the detection of long text.
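
The center-point penalty described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formula, so everything below is an assumption made for illustration: boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples (EAST itself predicts rotated boxes), the base term is `1 - IoU`, and the squared center distance is normalized by the squared diagonal of the smallest enclosing box, as in DIoU-style losses. The names `iou`, `center_penalty`, and `box_loss` are hypothetical.

```python
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max); an assumption for this sketch.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_penalty(a: Box, b: Box) -> float:
    """Squared distance between box centers, divided by the squared diagonal
    of the smallest box enclosing both (assumed DIoU-style normalization)."""
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    enc_w = max(a[2], b[2]) - min(a[0], b[0])
    enc_h = max(a[3], b[3]) - min(a[1], b[1])
    diag_sq = enc_w * enc_w + enc_h * enc_h
    return (dx * dx + dy * dy) / diag_sq if diag_sq > 0 else 0.0


def box_loss(pred: Box, gt: Box) -> float:
    """IoU loss plus the center-distance penalty (illustrative form only)."""
    return 1.0 - iou(pred, gt) + center_penalty(pred, gt)


if __name__ == "__main__":
    gt = (0.0, 0.0, 4.0, 4.0)
    centered = (1.0, 1.0, 3.0, 3.0)  # same IoU as `corner`, centers coincide
    corner = (0.0, 0.0, 2.0, 2.0)    # same IoU as `centered`, center is offset
    print(iou(centered, gt), iou(corner, gt))
    print(box_loss(centered, gt), box_loss(corner, gt))
```

In this example both predicted boxes have the same IoU with the ground truth, so a pure IoU loss cannot tell them apart. The penalty term gives the off-center box a larger loss, which is the kind of distinction between overlap patterns that the abstract refers to.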


Last Update: 2023-01-10