[1] DENG Yu-fang, ZHANG Ji-fu. A K-medoids Clustering Algorithm Based on Standard Deviation[J]. Computer Technology and Development, 2020, 30(08): 53-60. [doi:10.3969/j.issn.1673-629X.2020.08.009]
A K-medoids Clustering Algorithm Based on Standard Deviation
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]
- Volume: 30
- Issue: 2020, No. 08
- Pages: 53-60
- Column: Intelligence, Algorithms, and Systems Engineering
- Publication date: 2020-08-10
Article Information
- Title: A K-medoids Clustering Algorithm Based on Standard Deviation
- Article ID: 1673-629X(2020)08-0046-07
- Author(s): DENG Yu-fang (邓玉芳); ZHANG Ji-fu (张继福)
- Affiliation: School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
- Keywords: K-medoids clustering algorithm; initial center point; standard deviation; UCI dataset
- Classification number: TP311
- DOI: 10.3969/j.issn.1673-629X.2020.08.009
- Document code: A
- Abstract: K-medoids clustering has low sensitivity to isolated points and good robustness. However, because of how the initial cluster centers are selected and how the center points are iteratively updated, its clustering accuracy and efficiency are low. Since the standard deviation reflects the degree of dispersion of the data, this paper uses it to define a candidate set of initial center points and presents a K-medoids clustering algorithm based on standard deviation. First, the algorithm defines the candidate set of initial center points using the standard deviation and determines the initial center points by adding them one at a time. This ensures that sample points in dense regions are selected as initial cluster centers and avoids selecting sample points in sparse regions, especially isolated points. Second, following the principle that each data sample belongs to its nearest center point, the initial clusters are formed, and the cluster center points are updated repeatedly until the sum of squared clustering errors no longer changes, which yields the final clusters. Finally, experiments on UCI datasets and artificial datasets verify that the algorithm has good clustering accuracy, efficiency, and robustness.
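The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not give the exact definition of the candidate set or the stepwise selection rule, so the neighbourhood radius, the density measure, the candidate threshold, and the farthest-candidate rule below are all hypothetical choices, as are the function names `init_medoids` and `k_medoids`. The sketch also assumes the data points are distinct.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def init_medoids(D, X, k):
    """Pick k initial medoids from a density-based candidate set (assumed rule)."""
    # Assumption: the neighbourhood radius is the standard deviation of the
    # data, i.e. the root mean squared distance to the global mean.
    radius = np.sqrt(((X - X.mean(axis=0)) ** 2).sum(axis=1).mean())
    # Assumption: density = number of other points inside that radius.
    density = (D <= radius).sum(axis=1) - 1
    # Assumption: candidates are points at least as dense as the average,
    # which keeps isolated points out of the candidate set.
    candidates = np.flatnonzero(density >= density.mean())
    if len(candidates) < k:  # fall back to all points if too few candidates
        candidates = np.arange(len(X))
    # Stepwise selection (assumed): start from the densest candidate, then
    # repeatedly add the candidate farthest from the medoids chosen so far.
    medoids = [candidates[np.argmax(density[candidates])]]
    while len(medoids) < k:
        dist_to_chosen = D[np.ix_(candidates, medoids)].min(axis=1)
        medoids.append(candidates[np.argmax(dist_to_chosen)])
    return np.array(medoids)


def assign(D, medoids):
    """Assign each sample to its nearest medoid; return labels and SSE."""
    labels = D[:, medoids].argmin(axis=1)
    sse = float((D[np.arange(D.shape[0]), medoids[labels]] ** 2).sum())
    return labels, sse


def k_medoids(X, k, max_iter=100):
    """Alternate assignment and medoid update until the SSE stops changing."""
    X = np.asarray(X, dtype=float)
    D = pairwise_distances(X)
    medoids = init_medoids(D, X, k)
    labels, sse = assign(D, medoids)
    for _ in range(max_iter):
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue
            # New medoid: the member with the smallest total squared
            # distance to the other members of its cluster.
            costs = (D[np.ix_(members, members)] ** 2).sum(axis=1)
            medoids[j] = members[np.argmin(costs)]
        new_labels, new_sse = assign(D, medoids)
        converged = new_sse == sse
        labels, sse = new_labels, new_sse
        if converged:
            break
    return medoids, labels, sse
```

Calling `k_medoids(X, k)` on an array of shape `(n_samples, n_features)` returns the medoid indices, the cluster label of each sample, and the final sum of squared errors. The sketch builds the full distance matrix for simplicity, so it is only suitable for small datasets.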
Last Update: 2020-08-08