[1] WU Hong-xing, WANG Hao. Research on Enterprise Web Log Mining Based on Improved Apriori Algorithm[J]. Computer Technology and Development, 2015, 25(04): 43-48.

Research on Enterprise Web Log Mining Based on Improved Apriori Algorithm

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
25
Issue:
2015, No. 04
Pages:
43-48
Column:
Intelligence, Algorithms, Systems Engineering
Publication date:
2015-04-10

Article Info

Title:
 Research on Enterprise Web Log Mining Based on Improved Apriori Algorithm
Article number:
1673-629X(2015)04-0043-05
Author(s):
 WU Hong-xing (吴红星), WANG Hao (王浩)
 School of Computer and Information, Hefei University of Technology
Keywords:
 Web applications; logs; association rules; improved algorithm; Apriori algorithm
CLC number:
TP301.6
Document code:
A
Abstract:
 Enterprise Web logs hide a large amount of valuable information, but the Apriori algorithm has the drawback of generating many candidate sets and scanning the data set repeatedly. This paper studies the log information of enterprise collaborative portals and websites. The notice column of an enterprise's collaborative portal can publish company notices at any time, and it is what the enterprise wants users to see first. The news column of the enterprise website likewise presents company news and business-activity information to users, promoting the enterprise's brand and culture. Based on this commonality between collaborative portals and websites, the paper proposes an improved Apriori algorithm for enterprises: given that the enterprise actively shows notices or business news to visitors, the algorithm mines the standing of the other first-level main columns in visitors' minds, together with visitors' attention to and interest in those columns, so that the enterprise can adjust the layout of the other columns, better serve its publicity, and give visitors convenient access. The core idea of the improvement is to reduce the candidate sets. During the scanning process of the Apriori algorithm, one ID does not participate; once the algorithm has mined the maximal frequent itemsets, this ID is added to each of them and association rule mining is carried out. This substantially reduces both the number of data set scans and the number of candidates generated. Comparative experiments show that the improved Apriori algorithm is clearly effective and has strong practical value for enterprises.
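The core idea in the abstract (leave one ID out of the Apriori scan, then add it back to every maximal frequent itemset) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the excluded ID is the notice or news column and that it appears in every session, so removing it does not change the support counts of the remaining itemsets. The names `apriori`, `maximal`, `improved_apriori`, `pinned_id` and the sample sessions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Plain level-wise Apriori. Returns {frozenset: support count}."""
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_count:
                level[c] = n
        frequent.update(level)
        k += 1
    return frequent


def maximal(frequent):
    """Frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


def improved_apriori(sessions, pinned_id, min_count):
    # Assumption (not stated in the abstract): pinned_id occurs in every
    # session, so dropping it leaves all other support counts unchanged.
    reduced = [frozenset(s) - {pinned_id} for s in sessions]
    frequent = apriori(reduced, min_count)
    # Add the excluded ID back to each maximal frequent itemset.
    return {m | {pinned_id}: frequent[m] for m in maximal(frequent)}


# Hypothetical sessions: each set lists the columns one visitor opened.
sessions = [
    {"notice", "products", "contact"},
    {"notice", "products", "contact"},
    {"notice", "products", "about"},
    {"notice", "about"},
]
result = improved_apriori(sessions, pinned_id="notice", min_count=2)
```

Because the pinned column never enters the candidate sets, each level has fewer candidates to generate and count than a run over the full sessions would have. Association rules can then be derived from the returned itemsets in the usual way.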


Last Update: 2015-06-04