[1] WANG Mei, ZHANG Tian-shi, WANG Zhi-bao, et al. An Accelerator for SVR Algorithms Based on Spatial Projection and Clustering Partitioning[J]. Computer Technology and Development, 2024, 34(04): 24-29. [doi:10.3969/j.issn.1673-629X.2024.04.004]

An Accelerator for SVR Algorithms Based on Spatial Projection and Clustering Partitioning
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 04
Pages:
24-29
Column:
Big Data and Cloud Computing
Publication date:
2024-04-10

Article Info

Title:
An Accelerator for SVR Algorithms Based on Spatial Projection and Clustering Partitioning
Article ID:
1673-629X(2024)04-0024-06
Author(s):
WANG Mei (1, 2), ZHANG Tian-shi (1), WANG Zhi-bao (1), REN Yi-guo (1)
1. School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China;
2. Heilongjiang Key Laboratory of Petroleum Big Data and Intelligent Analysis (Northeast Petroleum University), Daqing 163318, Heilongjiang, China
Keywords:
large-scale data; divide-and-conquer method; support vector regression; principal component analysis; clustering
CLC number:
TP311
DOI:
10.3969/j.issn.1673-629X.2024.04.004
Abstract:
Data not only generates value but also drives the development of statistics as a science. Rapid technological progress has produced massive amounts of data, and at this scale many traditional processing methods struggle to meet the data analysis needs of various fields. Given the inefficiency of learning algorithms in the era of massive data, divide and conquer is usually regarded as the most direct and most widely used strategy for addressing this problem. Support vector regression (SVR) is a powerful regression algorithm with wide applications in fields such as pattern recognition and data mining. However, SVR training is inefficient on large-scale data. For this reason, we apply the divide-and-conquer idea and propose an SVR acceleration algorithm based on spatial projection and clustering partitioning (PKM-SVR). Projection vectors are used to project the data into a two-dimensional space; a clustering method divides the data space into k disjoint regions; an SVR model is trained on each region; and each region's SVR model predicts the test samples that fall into that region. Comparison experiments against traditional data partitioning methods on standard datasets show that the proposed algorithm trains faster and exhibits better prediction performance.
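The four steps in the abstract (project, partition, train per region, route predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the projection vectors are obtained, which clustering method is used, or whether the local models see the original or the projected features. The sketch assumes PCA for the projection (suggested only by the keywords), k-means for the clustering, and local SVR models trained on the original features. The class name `PartitionedSVR` and all parameter values are hypothetical, and scikit-learn is used purely for convenience.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.svm import SVR


class PartitionedSVR:
    """Project to 2-D, split into k regions, fit one SVR per region."""

    def __init__(self, k=4, random_state=0, **svr_params):
        self.k = k
        # Assumption: PCA supplies the two projection vectors.
        self.projector = PCA(n_components=2, random_state=random_state)
        # Assumption: k-means defines the k disjoint regions.
        self.clusterer = KMeans(n_clusters=k, n_init=10,
                                random_state=random_state)
        self.svr_params = svr_params
        self.models = {}

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Step 1: project the training data into a two-dimensional space.
        Z = self.projector.fit_transform(X)
        # Step 2: partition the projected space into k regions.
        regions = self.clusterer.fit_predict(Z)
        # Step 3: train one local SVR per non-empty region.
        # Assumption: local models use the original features, not Z.
        for r in range(self.k):
            mask = regions == r
            if mask.any():
                self.models[r] = SVR(**self.svr_params).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Step 4: route each sample to the model of the region it falls in.
        regions = self.clusterer.predict(self.projector.transform(X))
        y_pred = np.empty(len(X))
        for r in np.unique(regions):
            mask = regions == r
            y_pred[mask] = self.models[r].predict(X[mask])
        return y_pred


# Usage on synthetic data (values chosen only for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=600)
model = PartitionedSVR(k=4, C=10.0, epsilon=0.1).fit(X[:500], y[:500])
pred = model.predict(X[500:])
print(pred.shape)
```

The expected speed-up comes from the divide-and-conquer structure: kernel SVR training cost typically grows faster than linearly with the number of samples, so fitting k models on roughly n/k samples each tends to be cheaper than fitting one model on all n samples. How large the gain is depends on the solver, the kernel, and how balanced the regions are.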


Last Update: 2024-04-10