[1] YANG Jie, HUANG Gang. Research on SPRINT Algorithm Based on Cloud Computing[J]. Computer Technology and Development, 2017, 27(03): 108-112.

 Research on SPRINT Algorithm Based on Cloud Computing

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
27
Issue:
No. 03, 2017
Pages:
108-112
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2017-03-10

Article Info

Title:
 Research on SPRINT Algorithm Based on Cloud Computing
Article ID:
1673-629X(2017)03-0108-05
Author(s):
 YANG Jie, HUANG Gang
 School of Computer Science, Nanjing University of Posts and Telecommunications
Keywords:
 cloud computing; MapReduce; SPRINT algorithm; Gini index
CLC number:
TP301.6
Document code:
A
Abstract:
 Decision trees are an important data mining technique, widely used for data analysis and prediction. When mining massive data sets, traditional decision tree algorithms are constrained by CPU and memory, which leads to long running times, poor fault tolerance, and limited storage capacity. Cloud computing offers many advantages for processing massive data. This work targets SPRINT, a well-regarded decision tree algorithm: the algorithm is first optimized and then parallelized so that the optimized version is better suited to cloud computing. The traditional SPRINT algorithm suffers from multi-valued bias when building a decision tree; to reduce the effect of this bias, the Gini index is computed over two levels when a node is generated. For parallelization, the data are distributed across processors for execution and the results are then aggregated, which reduces the total execution time. Experimental results show that the improved SPRINT algorithm on a cloud computing platform achieves better classification accuracy and a clearly faster execution speed.
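
The abstract mentions two ingredients: scoring candidate splits with the Gini index, and distributing data to processors before aggregating the results. The following is a minimal sketch of those two ideas only, assuming the standard Gini index definition and a generic map/reduce pattern run in a single process. It does not reproduce the paper's two-level Gini computation or its actual MapReduce job layout, neither of which is specified here. All function names (`gini`, `gini_split`, `map_partition`, `reduce_counts`) and the toy data are hypothetical.

```python
from collections import Counter
from functools import reduce


def gini(counts):
    """Gini index 1 - sum(p_j^2) for a Counter mapping class label -> count."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_split(left, right):
    """Size-weighted Gini index of a binary split (lower is better)."""
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    if n == 0:
        return 0.0
    return (n_left / n) * gini(left) + (n_right / n) * gini(right)


def map_partition(records, threshold):
    """Map step (assumed): per-partition class counts on each side of
    the test `value <= threshold`."""
    left, right = Counter(), Counter()
    for value, label in records:
        (left if value <= threshold else right)[label] += 1
    return left, right


def reduce_counts(a, b):
    """Reduce step (assumed): merge the class counts of two partitions."""
    return a[0] + b[0], a[1] + b[1]


# Hypothetical toy data: (attribute value, class label), in two partitions.
partitions = [
    [(1.0, "A"), (2.0, "A"), (6.0, "B")],
    [(3.0, "A"), (7.0, "B"), (8.0, "B")],
]

left, right = reduce(reduce_counts, (map_partition(p, 4.0) for p in partitions))
print(gini_split(left, right))  # prints 0.0: the split at 4.0 separates A from B
```

Only per-partition class counts cross the map/reduce boundary in this sketch, so the final Gini computation is independent of how the records were partitioned.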


Last Update: 2017-05-18