[1] JIA Kui-kui, LIU Hai-bin. A FP-growth Algorithm Based on SOM Partition[J]. Computer Technology and Development, 2018, 28(04): 71-76. [doi:10.3969/j.issn.1673-629X.2018.04.015]

A FP-growth Algorithm Based on SOM Partition

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 04
Pages:
71-76
Section:
Intelligence, Algorithms and Systems Engineering
Publication date:
2018-04-10

Article Info

Title:
A FP-growth Algorithm Based on SOM Partition
Article ID:
1673-629X(2018)04-0071-06
Author(s):
JIA Kui-kui, LIU Hai-bin
China Aerospace Systems Science and Engineering Research Institute, Beijing 100048, China
Keywords:
FP-growth; SOM (self-organizing map); data mining; clustering; data partitioning
CLC number:
TP181
Document code:
A
Abstract:
The FP-growth algorithm can only handle relatively small data sets and struggles with massive ones. To address this, we improve the mining process of FP-growth and propose an FP-growth algorithm based on SOM (self-organizing map) partitioning. In the data preprocessing stage, each transaction in the original data is normalized to the same dimension. Because large data sets are difficult to process directly, systematic sampling is first used to extract a representative sample from the large data set. Since transactions containing frequent items have smaller Euclidean distances between them, SOM cluster analysis is then performed on the sample. According to the clustering results, the large data set is divided into several subsets, and FP-growth is executed on the subsets in parallel to mine association rules. The mining results of the subsets are combined to obtain the overall association rules. Experiments show that the improved algorithm reduces memory consumption, shortens mining time, improves the capacity and efficiency of processing massive data, and achieves a good speedup.
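
The partitioning stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "normalized to the same dimension" means a 0/1 vector over a fixed item list, it uses a small one-dimensional SOM with a Gaussian neighbourhood, and all function names (`encode`, `systematic_sample`, `train_som`, `partition`) are hypothetical. The FP-growth step itself is omitted; each resulting subset would be handed to an FP-growth implementation, possibly in parallel.

```python
import math
import random

def encode(transactions, items):
    """Map each transaction to a 0/1 vector of length len(items) (assumed encoding)."""
    index = {item: i for i, item in enumerate(items)}
    vectors = []
    for t in transactions:
        v = [0.0] * len(items)
        for item in t:
            v[index[item]] = 1.0
        vectors.append(v)
    return vectors

def systematic_sample(rows, step, start=0):
    """Systematic sampling: take every `step`-th row beginning at `start`."""
    return rows[start::step]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_unit(weights, v):
    """Index of the SOM unit whose weight vector is closest to v."""
    return min(range(len(weights)), key=lambda u: sq_dist(weights[u], v))

def train_som(sample, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM; units are neighbours if their indices are close."""
    rng = random.Random(seed)
    dim = len(sample[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs  # decays from 1 towards 1/epochs
        lr, sigma = lr0 * frac, sigma0 * frac
        for v in sample:
            bmu = best_unit(weights, v)
            for u, w in enumerate(weights):
                # Gaussian neighbourhood: units near the winner move more.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (v[d] - w[d])
    return weights

def partition(transactions, vectors, weights):
    """Assign every transaction to the subset of its closest SOM unit."""
    subsets = [[] for _ in weights]
    for t, v in zip(transactions, vectors):
        subsets[best_unit(weights, v)].append(t)
    return subsets

# Toy data, invented for illustration only.
items = ["a", "b", "c", "d"]
transactions = [["a", "b"], ["c", "d"], ["a", "b"],
                ["c", "d"], ["a", "b"], ["c", "d"]]
vectors = encode(transactions, items)
sample = systematic_sample(vectors, step=3)   # SOM is trained on the sample only
weights = train_som(sample, n_units=2)
subsets = partition(transactions, vectors, weights)  # full data set is split
```

Training on the sample and then assigning the full data set to the trained units mirrors the order of steps in the abstract: the costly clustering touches only a fraction of the data, while the assignment pass is a single nearest-unit lookup per transaction.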

Last Update: 2018-06-07