[1]李朋飞,陈曙晖,徐成成. 一种正则表达式编译器优化技术[J].计算机技术与发展,2015,25(09):123-128.
 LI Peng-fei,CHEN Shu-hui,XU Cheng-cheng. A Regular Expression Compiler Improvement Technique[J].,2015,25(09):123-128.
点击复制

 一种正则表达式编译器优化技术()
分享到:

《计算机技术与发展》[ISSN:1006-6977/CN:61-1281/TN]

卷:
25
期数:
2015年09期
页码:
123-128
栏目:
安全与防范
出版日期:
2015-09-10

文章信息/Info

Title:
 A Regular Expression Compiler Improvement Technique
文章编号:
1673-629X(2015)09-0123-06
作者:
 李朋飞陈曙晖徐成成
 国防科学技术大学 计算机学院
Author(s):
 LI Peng-feiCHEN Shu-huiXU Cheng-cheng
关键词:
正则表达式子集法确定型有限自动机并行优化
Keywords:
 regular expressionsubset construction algorithmDFAparallel optimization
分类号:
TP393.08
文献标志码:
A
摘要:
 正则表达式匹配在网络安全领域具有重要地位。传统的正则表达式匹配引擎通常采用NFA和DFA,由于具有匹配性能高的特点,DFA成为深度报文检测( DPI)的首选。但是DFA的生成首先需要由正则表达式转换成NFA,再由NFA转换成DFA,这个过程称为正则表达式的编译,且是一个计算非常密集的行为。针对构建DFA过程中耗时过多的问题,在Michela Becchi实现的编译过程的基础上,提出了一种基于多核平台的多线程并行化执行的方案,来降低构建DFA消耗的时间。同时针对所使用正则表达式中不能识别尾锚的不足,增加尾锚处理流程,提高正则表达式匹配的准确性。实验结果表明,经并行优化,构建DFA过程的加速比达到2.3及以上,且添加的尾锚处理流程经验证是正确的。
Abstract:
 Regular expression matching plays an important role in network security domain. Traditionally,NFA and DFA could be used in regular expression matching. As DFA has the characteristic of high throughput,it becomes the preferred option of Deep Packet Inspection ( DPI) . However,DFA construction needs a compilation,which first converts regular expressions into NFA,and NFA is then used to con-struct DFA. The compilation process is a computation-intensive behavior. In this paper,the most time-consuming process of the construc-tion of DFA is researched. Based on the previous works of Michela Becchi,propose a multi-thread parallel strategy to reduce time-cost in the compilation process on the multi-core platform. In addition,the function of tail anchor is added and the accuracy is proved,so that the regular expression matching engine can deal with tail anchor. The experimental results show that the compiling process can be accelerated by 2. 3 times with parallel optimization.

相似文献/References:

[1]宋鑫坤 陈万米 朱明 桂春胜 程硕远 陈海波.基于正则表达式的语音识别控制策略研究[J].计算机技术与发展,2010,(02):106.
 SONG Xin-kun,CHEN Wan-mi,ZHU Ming,et al.Study on Speech Recognition Control Strategy Based on Regular Expression[J].,2010,(09):106.
[2]胡琼凯 黄建华.基于协议分析和决策树的入侵检测研究[J].计算机技术与发展,2009,(06):179.
 HU Oiong-kai,HUANG Jian-hua.Intrusion Detection Based on Protocol Analysis and Decision Tree[J].,2009,(09):179.
[3]佘石泉 周肆清.正则表达式在编程题自动阅卷中的应用[J].计算机技术与发展,2007,(07):244.
 SHE Shi-quan,ZHOU Si-qing.Application of Regex in Auto- Checking Paper of Programs[J].,2007,(09):244.
[4]姜林美 詹玲超 黄继风.利用隐藏域实现PHP页面的事件机制[J].计算机技术与发展,2006,(10):68.
 JIANG Lin-mei,ZHAN Ling-chao,HUANG Ji-feng.PHP Event Mechanism Based on Hidden Field[J].,2006,(09):68.
[5]成卫青,于静,杨晶,等.基于页面分类的 Web 信息抽取方法研究[J].计算机技术与发展,2013,(01):54.
 CHENG Wei-qing,YU Jing,YANG Jing,et al.Web Information Extraction Research Based on Page Classification[J].,2013,(09):54.
[6]唐惠丽,郑小妹.正则表达式的研究及在Web中的应用[J].计算机技术与发展,2013,(02):82.
 TANG Hui-li,ZHENG Xiao-mei.Research of Regular Expressions and Application in Web[J].,2013,(09):82.
[7]张英辉,张水平,张凤琴,等.基于OpenStreetMap最短路径算法的分析与实现[J].计算机技术与发展,2013,(11):37.
 ZHANG Ying-hui,ZHANG Shui-ping,ZHANG Feng-qin,et al.Analysis and Implementation of Shortest Path Algorithm Based on OpenStreetMap[J].,2013,(09):37.
[8]李欣,李绍稳,许高建,等.基于正则抽取的竹种数据结构化方法研究[J].计算机技术与发展,2018,28(06):147.[doi:10.3969/ j. issn.1673-629X.2018.06.033]
 LI Xin,LI Shao-wen,XU Gao-jian,et al. Research on a Data Structuralization Method of Bamboo Species Based on Regular Extraction Model[J].,2018,28(09):147.[doi:10.3969/ j. issn.1673-629X.2018.06.033]

更新日期/Last Update: 2015-10-16