Research on Xixia Script Detection Based on Improved PSENet

Computer Technology and Development [ISSN: 1673-629X / CN: 61-1281/TN]

Issue: 2025, No. 05

Pages: 16-22

Column: Media Computing

Publication date: 2025-05-10

Article Info

Title: Research on Xixia Script Detection Based on Improved PSENet

Article ID: 1673-629X(2025)05-0016-07

Author(s): YU Hai-qing; ZHENG Ting-shuai; SHI Wei*; School of Information Engineering, Ningxia University, Yinchuan 750021, China

Keywords: text detection; multi-scale features; feature fusion; adaptive attention; Xixia ancient texts

CLC number: TP391.1

DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0402

Abstract: Xixia characters have distinctive shapes, complex structures, and numerous strokes, and ancient Xixia texts suffer from missing characters, foxing, and fading, so existing text detection models cannot locate the characters accurately. Building on an analysis of current mainstream research, this paper proposes a Xixia text detection method based on an improved PSENet network model. First, PSA replaces the 3×3 convolution in the ResNet bottleneck to form EPSANet, which effectively extracts finer-grained multi-scale spatial information. Second, an Adaptive Attention Module (AAM) is proposed to reduce information loss during feature map generation. Finally, an Attention Feature Fusion (AFF) module is introduced to better fuse features with inconsistent semantics and scales. Experiments on a Xixia text dataset show that, compared with the standard PSENet model, the improved model raises precision and F1-score by 3.9 and 3.4 percentage points, respectively. It also shows clear improvements over other mainstream models, which demonstrates the effectiveness of the method.
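The abstract reports gains in precision and F1-score as percentage points. As a rough illustration of how such figures are usually computed for text detection, the sketch below derives precision, recall, and F1 from counts of matched and unmatched boxes, then expresses the difference between two models in percentage points. The function names and the counts are hypothetical and chosen only to show the arithmetic; they are not taken from the paper, whose matching rule and evaluation protocol are not given in this record.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    tp: predicted boxes matched to a ground-truth box
    fp: predicted boxes with no matching ground-truth box
    fn: ground-truth boxes that no prediction matched
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def percentage_point_gain(baseline: float, improved: float) -> float:
    """Absolute difference between two rates in [0, 1], in percentage points."""
    return (improved - baseline) * 100.0


# Hypothetical counts for two models (illustration only, not the paper's data).
base_p, base_r, base_f1 = detection_metrics(tp=80, fp=20, fn=20)
new_p, new_r, new_f1 = detection_metrics(tp=85, fp=15, fn=20)

print(f"precision gain: {percentage_point_gain(base_p, new_p):.1f} points")
print(f"F1 gain: {percentage_point_gain(base_f1, new_f1):.1f} points")
```

A percentage-point gain is an absolute difference between two rates, so a rise from 80% to 85% precision is 5 percentage points, not a 5% relative improvement.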

