[1] ZHANG An, MA Ming-dong. Research on Preprocessing Based on Tesseract Text Recognition[J]. Computer Technology and Development, 2021, 31(01): 73-76. [doi:10.3969/j.issn.1673-629X.2021.01.013]

Research on Preprocessing Based on Tesseract Text Recognition

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
31
Issue:
2021, No. 01
Pages:
73-76
Section:
Graphics and Images
Publication date:
2021-01-10

Article Info

Title:
Research on Preprocessing Based on Tesseract Text Recognition
Article ID:
1673-629X(2021)01-0073-04
Author(s):
ZHANG An 1, MA Ming-dong 2
1. School of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; 2. School of Geographical and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Keywords:
OCR; text recognition; preprocessing; Tesseract framework; C++
Classification number:
TP39
DOI:
10.3969/j.issn.1673-629X.2021.01.013
Abstract:
The Tesseract text recognition framework places pixel requirements on its input images, and captured images may be skewed or have black borders. Following the text recognition pipeline, this paper studies four preprocessing steps (binarization, scaling, border handling, and tilt correction) and implements them in C++. It first outlines the OCR (optical character recognition) process, then focuses on image scaling and binarization. Scaling uses the bilinear interpolation algorithm, interpolating linearly along the horizontal and vertical coordinates pixel by pixel and row by row. Binarization uses the maximum between-class variance method, which follows a clustering idea: the gray values are traversed to find the optimal binarization threshold. With reference to OpenCV library functions, approaches to handling image borders and offset are proposed. The whole pipeline is implemented on the Tesseract framework in VS2015, and the framework's interfaces and functions, along with its input and output parameters, are introduced. Image preprocessing is essential for text recognition and benefits Tesseract's subsequent recognition work.
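The paper's own C++ code is not reproduced on this page, so the following is only a minimal sketch of the two steps the abstract emphasizes, written under stated assumptions. The `GrayImage` struct and the function names `otsuThreshold`, `binarize`, and `resizeBilinear` are hypothetical, the image is assumed to be non-empty 8-bit grayscale in row-major order, and the pixel-center coordinate mapping used for scaling is one common convention rather than necessarily the authors' choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit grayscale image, row-major; assumed non-empty.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // width * height values
};

// Maximum between-class variance (Otsu): traverse every gray level t and keep
// the one that best separates the classes {<= t} and {> t}.
int otsuThreshold(const GrayImage& img) {
    std::vector<double> hist(256, 0.0);
    for (std::uint8_t p : img.pixels) hist[p] += 1.0;

    const double total = static_cast<double>(img.pixels.size());
    double sumAll = 0.0;
    for (int g = 0; g < 256; ++g) sumAll += g * hist[g];

    double countLow = 0.0, sumLow = 0.0, bestScore = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        countLow += hist[t];
        sumLow += t * hist[t];
        const double countHigh = total - countLow;
        if (countLow == 0.0) continue;  // no pixels in the low class yet
        if (countHigh == 0.0) break;    // no pixels left in the high class
        const double diff = sumLow / countLow - (sumAll - sumLow) / countHigh;
        // Proportional to the between-class variance (constant factor dropped).
        const double score = countLow * countHigh * diff * diff;
        if (score > bestScore) { bestScore = score; bestT = t; }
    }
    return bestT;
}

// Pixels above the threshold become white (255), the rest black (0).
void binarize(GrayImage& img, int threshold) {
    for (std::uint8_t& p : img.pixels) p = (p > threshold) ? 255 : 0;
}

// Bilinear scaling: interpolate along x on the two nearest source rows, then
// along y between those two results. dstW and dstH are assumed positive.
GrayImage resizeBilinear(const GrayImage& src, int dstW, int dstH) {
    GrayImage dst{dstW, dstH,
                  std::vector<std::uint8_t>(static_cast<std::size_t>(dstW) * dstH)};
    const double sx = static_cast<double>(src.width) / dstW;
    const double sy = static_cast<double>(src.height) / dstH;
    for (int y = 0; y < dstH; ++y) {
        // Map the destination pixel center into source space and clamp.
        const double fy = std::max(0.0, std::min((y + 0.5) * sy - 0.5, src.height - 1.0));
        const int y0 = static_cast<int>(fy);
        const int y1 = std::min(y0 + 1, src.height - 1);
        const double wy = fy - y0;
        for (int x = 0; x < dstW; ++x) {
            const double fx = std::max(0.0, std::min((x + 0.5) * sx - 0.5, src.width - 1.0));
            const int x0 = static_cast<int>(fx);
            const int x1 = std::min(x0 + 1, src.width - 1);
            const double wx = fx - x0;
            const double top = (1.0 - wx) * src.pixels[y0 * src.width + x0]
                             + wx * src.pixels[y0 * src.width + x1];
            const double bottom = (1.0 - wx) * src.pixels[y1 * src.width + x0]
                                + wx * src.pixels[y1 * src.width + x1];
            const double value = (1.0 - wy) * top + wy * bottom;
            // Round to the nearest gray level.
            dst.pixels[static_cast<std::size_t>(y) * dstW + x] =
                static_cast<std::uint8_t>(value + 0.5);
        }
    }
    return dst;
}
```

In a pipeline like the one the abstract describes, an image would typically be scaled with `resizeBilinear`, thresholded with `otsuThreshold` and `binarize`, and the resulting buffer then handed to Tesseract, for example through `tesseract::TessBaseAPI::SetImage`. The order of steps here is an assumption, not something the abstract specifies.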

Last Update: 2020-01-10