[1] ZHANG Xiao-bin, MU Yu-xue. An Improved K-medoids Algorithm for Initial Center of Variance Optimization[J]. Computer Technology and Development, 2020, 30(07): 42-45. [doi:10.3969/j.issn.1673-629X.2020.07.010]

An Improved K-medoids Algorithm for Initial Center of Variance Optimization

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume: 30
Issue: 2020, No. 07
Pages: 42-45
Column: Intelligence, Algorithms, and Systems Engineering
Publication date: 2020-07-10

Article Info

Title: An Improved K-medoids Algorithm for Initial Center of Variance Optimization
Article number: 1673-629X(2020)07-0042-04
Author(s): ZHANG Xiao-bin, MU Yu-xue
Affiliation: School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China
Keywords: K-medoids algorithm; initial cluster center; variance optimization; maximum distance product method; sample density
CLC number: TP301
DOI: 10.3969/j.issn.1673-629X.2020.07.010
Abstract:
The traditional K-medoids algorithm is sensitive to its initial values, easily falls into local optima, and is unstable, while the existing K-medoids algorithm that optimizes initial centers by variance suffers from high time complexity and an imprecise neighborhood radius. To address these problems, an improved K-medoids clustering algorithm based on variance-optimized initial centers is proposed. The algorithm introduces the concept of global variance and uses it as the density parameter of each sample. A subset of samples with small variance values forms the candidate set of initial cluster centers, and the maximum distance product method then selects from this set K samples that have small variance and lie far apart as the initial cluster centers, so that the initial centers are both well dispersed and representative. When cluster centers are updated, the search range is expanded step by step according to the sample density principle, replacing the traditional random selection. Experimental results on UCI data sets show that the algorithm not only improves the selection of initial cluster centers but also improves clustering speed and clustering quality.
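
The initialization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract does not give the exact formulas. Three assumptions are made here: the "global variance" of a sample is taken to be its mean squared Euclidean distance to all other samples; the candidate set is the fraction `candidate_ratio` (a hypothetical parameter) of samples with the smallest variance; and the maximum distance product method picks the smallest-variance candidate first, then repeatedly adds the candidate whose product of distances to the centers chosen so far is largest. The function names are illustrative only.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def global_variances(data: List[Point]) -> List[float]:
    """Assumed density parameter: mean squared distance from each sample
    to every other sample (a smaller value means a denser, more central sample)."""
    n = len(data)
    return [
        sum(euclidean(data[i], data[j]) ** 2 for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def select_initial_medoids(
    data: List[Point], k: int, candidate_ratio: float = 0.5
) -> List[int]:
    """Return the indices of k initial medoids.

    candidate_ratio is a hypothetical parameter: the fraction of samples
    (those with the smallest variance) kept as candidates.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")
    variances = global_variances(data) if n > 1 else [0.0]

    # Keep at least k candidates so that k centers can always be chosen.
    n_candidates = max(k, math.ceil(candidate_ratio * n))
    candidates = sorted(range(n), key=lambda i: variances[i])[:n_candidates]

    # First center: the candidate with the smallest variance.
    centers = [candidates[0]]
    while len(centers) < k:
        remaining = [c for c in candidates if c not in centers]
        # Next center: the candidate maximizing the product of its
        # distances to all centers chosen so far.
        best = max(
            remaining,
            key=lambda c: math.prod(euclidean(data[c], data[m]) for m in centers),
        )
        centers.append(best)
    return centers
```

Restricting the choice to low-variance candidates favors representative samples, while the distance product pushes the chosen centers apart; this is the trade-off between representativeness and dispersion that the abstract mentions. The density-guided center update is not sketched, since the abstract gives too little detail to reconstruct it.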


Last Update: 2020-07-10