[1] WANG Cheng, ZHAO Shen-yi. Research on an Improved Incremental Updated Algorithm for Parallel Association Rule[J]. Computer Technology and Development, 2018, 28(07): 48-52. [doi:10.3969/j.issn.1673-629X.2018.07.011]

Research on an Improved Incremental Updated Algorithm for Parallel Association Rule

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 07
Pages:
48-52
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2018-07-10

Article Info

Title:
Research on an Improved Incremental Updated Algorithm for Parallel Association Rule
Article ID:
1673-629X(2018)07-0048-05
Author(s):
WANG Cheng, ZHAO Shen-yi
School of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Keywords:
Spark; association rule; incremental updating; parallel computing; FP-tree
CLC number:
TP311
DOI:
10.3969/j.issn.1673-629X.2018.07.011
Document code:
A
Abstract:
When a dataset is updated dynamically, traditional parallel association rule algorithms based on frequent pattern growth must compress the entire updated dataset into a frequent pattern tree, which consumes considerable time and storage space. They also do not fully account for the unequal load between groups that arises when the header table is partitioned into groups. To address the incremental updating problem caused by rapid data growth in practical association rule mining, we propose an improved parallel incremental updating algorithm for association rules, based on the parallel frequent pattern growth (PFP-tree) algorithm and the Spark distributed parallel processing framework. During an incremental update, existing mining results are reused to build frequent pattern trees for the newly added data, which reduces mining time and storage space. An improved header-table grouping strategy balances the load across the parallel mining nodes. Experimental analysis shows that, compared with traditional incremental updating algorithms for association rules, the proposed algorithm is feasible, offers high mining efficiency and scalability, and is suitable for dynamically growing big data environments.
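The abstract does not spell out the improved header-table grouping strategy, so the following is only a rough sketch of what load-balanced grouping can look like, not the authors' method. It uses a generic greedy heuristic (heaviest item first, always into the currently lightest group). The function names `balanced_groups` and `group_loads`, the item names, and the per-item load estimates are all hypothetical and chosen for illustration.

```python
import heapq
from typing import Dict, List


def balanced_groups(item_loads: Dict[str, float], num_groups: int) -> List[List[str]]:
    """Greedily assign header-table items to groups with roughly equal total load.

    item_loads maps each frequent item to an estimated mining cost.
    How that cost is estimated is an assumption left to the caller.
    """
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")

    # Min-heap of (accumulated load, group index): the lightest group is on top.
    heap = [(0.0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups: List[List[str]] = [[] for _ in range(num_groups)]

    # Place the heaviest items first; break ties by name for determinism.
    for item, load in sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


def group_loads(groups: List[List[str]], item_loads: Dict[str, float]) -> List[float]:
    """Return the total estimated load of each group."""
    return [sum(item_loads[item] for item in group) for group in groups]


if __name__ == "__main__":
    # Hypothetical load estimates for five frequent items.
    loads = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
    groups = balanced_groups(loads, num_groups=2)
    print(groups, group_loads(groups, loads))
```

In a Spark setting, each resulting group would typically be mined by one task, so evening out the estimated loads is one plausible way to reduce stragglers. The greedy rule is a heuristic and does not guarantee an optimal split.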


Last Update: 2018-08-27