[1] LIU Yan-rong (刘彦戎), YANG Yun (杨云). A Data Mining Algorithm for Matrix and Sort Index Association Rules[J]. Computer Technology and Development (计算机技术与发展), 2021, 31(02): 54-59. [doi:10.3969/j.issn.1673-629X.2021.02.010]

A Data Mining Algorithm for Matrix and Sort Index Association Rules

Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
31
Issue:
2021, No. 02
Pages:
54-59
Section:
Big Data Analysis and Mining
Publication date:
2021-02-10

Article Info

Title:
A Data Mining Algorithm for Matrix and Sort Index Association Rules
Article ID:
1673-629X(2021)02-0054-06
Author(s):
LIU Yan-rong (刘彦戎) 1, YANG Yun (杨云) 2
1. School of Information and Engineering, Shaanxi Institute of International Trade & Commerce, Xi'an 712000, China;
2. School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021, China
Keywords:
data mining; association rules; Apriori algorithm; matrix algorithm; sorting index; sequence marker
CLC number:
TP305
DOI:
10.3969/j.issn.1673-629X.2021.02.010
Abstract:
Among association rule mining algorithms, Apriori scans the database many times and generates a large number of candidate sets, which leads to heavy I/O overhead and low mining efficiency. Matrix-based association rule algorithms do not delete infrequent itemsets during mining, so many scans are wasted and the gain in mining efficiency is limited. This paper proposes an improved association rule mining algorithm based on a matrix and a sorted index. It first deletes unneeded transactions and items, then obtains the frequent 2-itemsets by matrix multiplication and a lookup table, and finally derives the remaining frequent k-itemsets with the help of the sorted index. Compared with the matrix association rule algorithm and the Apriori algorithm, the proposed algorithm can directly find frequent itemsets and scan the database. When many frequent itemsets are generated or the database needs to be updated dynamically, the algorithm shows good feasibility and execution efficiency. Experiments show that the proposed algorithm reduces memory usage and I/O overhead, improves mining efficiency, and scales well.
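
The first two steps named in the abstract (pruning transactions and items, then finding frequent 2-itemsets through a matrix product) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `frequent_pairs`, the pruning thresholds, and the sample data are assumptions made for illustration, and the lookup table and sorted-index steps are omitted because the abstract does not specify them.

```python
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {(a, b): count} for item pairs with support count >= min_support.

    Hypothetical helper; an illustrative sketch, not the paper's algorithm.
    """
    # Step 1: count single items and drop the infrequent ones.
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    items = sorted(i for i, c in counts.items() if c >= min_support)
    col = {item: j for j, item in enumerate(items)}

    # Step 2: build the 0/1 transaction matrix D, dropping transactions
    # that keep fewer than two frequent items (they cannot support a pair).
    rows = []
    for t in transactions:
        row = [0] * len(items)
        for item in set(t):
            if item in col:
                row[col[item]] = 1
        if sum(row) >= 2:
            rows.append(row)

    # Step 3: entry (i, j) of the product D^T D is the number of
    # transactions that contain both item i and item j.
    pairs = {}
    for i, j in combinations(range(len(items)), 2):
        support = sum(r[i] * r[j] for r in rows)
        if support >= min_support:
            pairs[(items[i], items[j])] = support
    return pairs


# Assumed sample data: five transactions over the items a, b, c, d.
sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a", "b", "c"]]
result = frequent_pairs(sample, 2)
# result == {("a", "b"): 3, ("a", "c"): 3, ("b", "c"): 2}
```

In this sketch item `d` is removed in step 1 and the transaction `["b", "d"]` in step 2, so the matrix product in step 3 runs over a smaller matrix than the raw data would give.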


Last Update: 2020-02-10