[1] QIN Jun, DAI Xin-hua, TONG Yi, et al. Research on SVM Classification Algorithm Based on MapReduce[J]. Computer Technology and Development, 2015, 25(06): 87-91.

 Research on SVM Classification Algorithm Based on MapReduce

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
25
Issue:
2015, No. 06
Pages:
87-91
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2015-06-10

Article Info

Title:
 Research on SVM Classification Algorithm Based on MapReduce
Article ID:
1673-629X(2015)06-0087-05
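Author(s):
 QIN Jun[1], DAI Xin-hua[2], TONG Yi[2], LIN Qiao-min[1]
 1. School of Education Science and Technology, Nanjing University of Posts and Telecommunications; 2. School of Computer Science, Nanjing University of Posts and Telecommunications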
Keywords:
 MapReduce; SVM classification algorithm; genetic algorithm; cloud computing
CLC number:
TP301.6
Document code:
A
Abstract:
 In cloud computing environments, traditional MapReduce-based SVM classification algorithms train on a data set by simply merging the support vectors obtained on each child node, so the efficiency and accuracy of the resulting classifier are unsatisfactory. To address this, the paper proposes an improved training algorithm with four steps. First, a genetic algorithm is run on each node to find the optimal kernel function and parameters for that node's data subset. Second, the resulting parameter combination is used to train on the subset and obtain its support vectors. Third, the support vectors from all nodes are merged into a set of global support vectors. Fourth, each node merges its data subset with the global support vectors to form a new training data set. These four steps are repeated until the global support vectors no longer change, at which point the procedure has converged to the optimal classification model. Finally, experiments on the open-source cloud computing platform Hadoop show that the classification accuracy of the new algorithm is clearly higher than that of the traditional classification algorithm.
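
The four-step loop described in the abstract can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the MapReduce job is simulated with ordinary functions, the genetic search for the kernel function and parameters is replaced by fixed hyperparameters, the SVM is a bias-free linear model fitted by subgradient descent, and support vectors are approximated by a margin cutoff. All function names and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

Sample = Tuple[Tuple[float, float], int]  # ((x1, x2), label in {-1, +1})

def train_linear_svm(data: List[Sample], lam=0.01, lr=0.05, epochs=200) -> List[float]:
    """Bias-free linear SVM: full-batch subgradient descent on the hinge loss.
    Assumption: fixed lam/lr stand in for the GA-selected kernel and parameters."""
    w = [0.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        grad = [lam * w[0], lam * w[1]]              # gradient of the L2 penalty
        for x, y in data:
            if y * (w[0] * x[0] + w[1] * x[1]) < 1.0:  # margin violated
                grad[0] -= y * x[0] / n
                grad[1] -= y * x[1] / n
        w = [w[0] - lr * grad[0], w[1] - lr * grad[1]]
    return w

def support_vectors(w: List[float], data: List[Sample], tol=0.1) -> Set[Sample]:
    """Heuristic: keep samples on or inside the margin, or closest to it."""
    margins = [y * (w[0] * x[0] + w[1] * x[1]) for x, y in data]
    cutoff = max(1.0, min(margins)) + tol
    return {s for s, m in zip(data, margins) if m <= cutoff}

def map_phase(subsets: List[List[Sample]], global_svs: Set[Sample]) -> List[Set[Sample]]:
    """Each simulated node trains on its subset merged with the global SVs."""
    local = []
    for subset in subsets:
        merged = sorted(set(subset) | global_svs)  # subset + global SVs
        w = train_linear_svm(merged)               # train on the merged set
        local.append(support_vectors(w, merged))
    return local

def reduce_phase(local_sv_sets: List[Set[Sample]]) -> Set[Sample]:
    """Merge the per-node support vectors into the global set."""
    merged: Set[Sample] = set()
    for svs in local_sv_sets:
        merged |= svs
    return merged

def iterative_sv_training(data: List[Sample], n_nodes=2, max_rounds=10):
    """Repeat map/reduce until the global SV set stops changing (or the cap hits)."""
    subsets = [data[i::n_nodes] for i in range(n_nodes)]
    global_svs: Set[Sample] = set()
    for rounds in range(1, max_rounds + 1):
        new_svs = reduce_phase(map_phase(subsets, global_svs))
        if new_svs == global_svs:
            break
        global_svs = new_svs
    return train_linear_svm(sorted(global_svs)), global_svs, rounds

def predict(w: List[float], x: Tuple[float, float]) -> int:
    return 1 if w[0] * x[0] + w[1] * x[1] >= 0 else -1

# Toy, linearly separable data (made up for illustration).
DATA: List[Sample] = [
    ((1.0, 1.0), 1), ((-1.0, -1.0), -1), ((-2.0, -1.0), -1), ((2.0, 1.0), 1),
    ((1.0, 2.0), 1), ((-1.0, -2.0), -1), ((-3.0, -3.0), -1), ((3.0, 3.0), 1),
]
model, global_svs, rounds = iterative_sv_training(DATA)
```

The `max_rounds` cap is a safeguard added for the sketch; the abstract states only that iteration stops when the global support vectors no longer change. In a Hadoop setting, `map_phase` would presumably correspond to the per-node mappers and `reduce_phase` to a single reducer, but that mapping is an assumption here.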


Last Update: 2015-07-27