[1] LI Chun-sheng, CAO Qi, YU Shu. Research on Multi-model Fusion Algorithm for Small Scale Data Sets[J]. Computer Technology and Development, 2020, 30(02): 63-66. [doi:10.3969/j.issn.1673-629X.2020.02.013]

Research on Multi-model Fusion Algorithm for Small Scale Data Sets

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume: 30
Issue: 2020, No. 02
Pages: 63-66
Column: Intelligence, Algorithms, Systems Engineering
Publication date: 2020-02-10

Article Info

Title: Research on Multi-model Fusion Algorithm for Small Scale Data Sets
Article ID: 1673-629X(2020)02-0063-04
Author(s): LI Chun-sheng, CAO Qi, YU Shu
Affiliation: School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China
Keywords: data mining; machine learning; logistic regression; decision tree; model fusion
CLC number: TP181
DOI: 10.3969/j.issn.1673-629X.2020.02.013
Abstract:
At present, predictions on small-scale data sets rely mainly on traditional machine learning algorithms, but a single traditional model often fails to reach the expected accuracy and cannot balance multiple evaluation metrics. Therefore, taking small-scale data sets as the research object and integrating three types of model (decision tree, logistic regression, and support vector machine), we propose a multi-model fusion algorithm and analyze its effect on small-scale data sets. First, the principles of decision tree, logistic regression, and support vector machine are briefly described. Second, the three models are used as base learners and trained separately; the output of each model serves as input to the next-stage model, and maximum likelihood estimation is used to iteratively optimize the parameters, which completes the multi-model fusion process. Finally, the data sets are analyzed and processed, and the fused model is compared with the single models through experiments. The results show that the multi-model fusion algorithm markedly improves prediction precision, recall, and accuracy.
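The abstract does not spell out the second-stage model, so the following is only a minimal sketch of one plausible reading: the positive-class probabilities produced by the three base learners are fed to a logistic second-stage model whose weights are fitted by gradient ascent on the log-likelihood (iterative maximum likelihood estimation). The function names (`fit_fusion_weights`, `predict`, `precision_recall_accuracy`), the learning rate, the epoch count, and the toy numbers are all assumptions made for illustration and do not come from the paper; in particular, the base-learner outputs are hand-written placeholders rather than the output of trained models.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion_weights(base_outputs, labels, lr=0.5, epochs=5000):
    """Fit a logistic second-stage model by gradient ascent on the mean log-likelihood.

    base_outputs: one row per sample; each entry is a base learner's predicted
                  probability of the positive class (hypothetical inputs here).
    labels:       0/1 ground-truth label per sample.
    """
    n_models = len(base_outputs[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_outputs, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            err = y - p  # derivative of the Bernoulli log-likelihood w.r.t. the logit
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        # Simultaneous update of all parameters from the same gradient.
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


def predict(base_outputs, weights, bias, threshold=0.5):
    # Fused prediction: threshold the second-stage probability.
    return [
        int(sigmoid(bias + sum(w * x for w, x in zip(weights, row))) >= threshold)
        for row in base_outputs
    ]


def precision_recall_accuracy(y_true, y_pred):
    # The three metrics named in the abstract, computed from confusion counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Hypothetical base-learner outputs (NOT data from the paper).
# Columns: decision tree, logistic regression, support vector machine.
base_outputs = [
    [0.9, 0.8, 0.9],
    [0.4, 0.9, 0.9],
    [0.9, 0.4, 0.8],
    [0.8, 0.9, 0.4],
    [0.1, 0.2, 0.1],
    [0.6, 0.1, 0.2],
    [0.2, 0.6, 0.1],
    [0.1, 0.2, 0.6],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_fusion_weights(base_outputs, labels)
fused = predict(base_outputs, weights, bias)
print("fused metrics:", precision_recall_accuracy(labels, fused))
for j, name in enumerate(["decision tree", "logistic regression", "SVM"]):
    single = [int(row[j] >= 0.5) for row in base_outputs]
    print(name, "metrics:", precision_recall_accuracy(labels, single))
```

In this toy setup each placeholder base learner is wrong on a different pair of samples, which is the situation where a second-stage model can help. A real experiment would fit the second stage on held-out (for example, cross-validated) base-learner outputs rather than on the training predictions, and would map raw SVM scores to probabilities before fusion.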


Last Update: 2020-02-10