Research on an Apriori Incremental Update Algorithm Based on Binary Coding

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, No. 01

Pages:: 47-53

Column:: Big Data Analysis and Mining

Publication date:: 2022-01-10

Article Info

Title:: Research on Apriori Incremental Update Improved Algorithm Based on Binary Code

Article ID:: 1673-629X(2022)01-0047-07

Author(s):: LUO Zhang-ming (罗章铭); TANG Jie (唐杰); HUANG Yi-qi (黄逸奇); ZHANG Jin (张锦); School of Information Science and Engineering, Hunan Normal University, Changsha 410006, China

Keywords:: data mining; Apriori algorithm; association rules; binary; incremental update

CLC number:: TP301.6

DOI:: 10.3969/j.issn.1673-629X.2022.01.009

Abstract:: The classic Apriori algorithm scans the database repeatedly during iteration and must reprocess all data whenever the data are dynamically updated. To address these shortcomings, an improved incremental-update algorithm based on binary coding, CBEF-Apriori, is proposed. Its core idea is to convert the itemsets and transactions obtained after adding the increment into binary codes, so that computing itemset support becomes a bitwise operation between the binary codes of the itemsets and those of the transaction database. The improved algorithm filters the frequent itemsets generated from the original database and the candidate itemsets newly generated from the incremental database, which effectively reduces the number of candidate itemsets, improves efficiency, and better matches practical needs. Experimental results show that, compared with the classic Apriori algorithm and the CBE-Apriori algorithm, the improved algorithm markedly improves computational efficiency without reducing the number of correct frequent itemsets mined. On small data sets the improvement is up to 3.6 times over classic Apriori and up to 1.4 times over CBE-Apriori; on larger data sets it is up to 10.41 times over classic Apriori and up to 11.53 times over CBE-Apriori.
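
The bitwise support counting described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's CBEF-Apriori implementation: the helper names (`build_item_index`, `encode`, `support_count`) and the toy data are assumptions made for illustration, and candidate generation and pruning are omitted. Each transaction and itemset is encoded as an integer bitmask, and an itemset is contained in a transaction exactly when ANDing the two masks returns the itemset mask.

```python
from typing import Dict, Iterable, List


def build_item_index(transactions: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Assign each distinct item a bit position (sorted for determinism)."""
    items = sorted({item for t in transactions for item in t})
    return {item: pos for pos, item in enumerate(items)}


def encode(itemset: Iterable[str], index: Dict[str, int]) -> int:
    """Encode an itemset or transaction as an integer bitmask."""
    mask = 0
    for item in itemset:
        mask |= 1 << index[item]
    return mask


def support_count(itemset_mask: int, transaction_masks: List[int]) -> int:
    """Count transactions whose mask contains every bit of the itemset mask."""
    return sum(1 for t in transaction_masks if (t & itemset_mask) == itemset_mask)


# Hypothetical toy data: an original database and a later increment.
original = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
increment = [{"a", "c", "d"}, {"b", "c"}]

index = build_item_index(original + increment)
original_masks = [encode(t, index) for t in original]
increment_masks = [encode(t, index) for t in increment]

ac = encode({"a", "c"}, index)

# Support count of {a, c} in the original database.
old_count = support_count(ac, original_masks)

# Incremental update: only the new transactions are scanned, and the
# stored count from the original database is reused.
new_count = old_count + support_count(ac, increment_masks)
```

Reusing `old_count` rather than rescanning `original_masks` is the point of the incremental step in this sketch; the result equals a full recount over the combined database.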

