[1] ZHENG Yuan, XIE Yun. A Determination Method for Clustering Numbers and Initial Centers Based on Partitioning[J]. Computer Technology and Development, 2017, 27(07): 76-78.

 A Determination Method for Clustering Numbers and Initial Centers Based on Partitioning

Computer Technology and Development [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
27
Issue:
2017, No. 07
Pages:
76-78
Column:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2017-07-10

Article Info

Title:
 A Determination Method for Clustering Numbers and Initial Centers Based on Partitioning
Article ID:
1673-629X(2017)07-0076-03
Author(s):
 ZHENG Yuan (征原), XIE Yun (谢云)
Affiliation:
 Jiangsu Key Laboratory of Wireless Communications, Nanjing University of Posts and Telecommunications
Keywords:
 k-means clustering; number of clusters; initial cluster centers; partitioning
CLC number:
TP311
Document code:
A
Abstract:
 The k-means clustering algorithm requires the number of clusters and the initial cluster centers to be fixed before clustering. The number of clusters, however, is difficult to specify accurately, and the initial centers are usually chosen as k randomly selected samples. Because different initial centers can lead to different clustering results, random selection is largely blind and makes the results highly unstable. To address this, a partition-based method for determining the number of clusters and the initial centers is proposed. The method partitions the data space into grid cells, takes the number of data points in each cell as that cell's data density, and counts the cells whose density is a local maximum. The data set is partitioned at different division values; when the number of local-maximum-density cells becomes relatively stable, that number is taken as the number of clusters, and the initial cluster centers are obtained at the same time. Simulation experiments on UCI machine learning data sets and on randomly generated synthetic data sets show that the proposed algorithm is effective and feasible, with high accuracy.
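
The procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not specify the neighbourhood used for "local maximum", the stability criterion, or how a center is derived from a peak cell. The sketch assumes that a peak cell must hold strictly more points than every adjacent cell (diagonals included), that "stable" means the same peak count for `stable_runs` consecutive grid resolutions, and that each initial center is the mean of the points in a peak cell. The names `grid_density`, `local_maxima`, `estimate_k_and_centers`, `min_count`, and `stable_runs` are hypothetical.

```python
from collections import Counter
from itertools import product


def grid_density(points, bins, lo, hi):
    """Split every axis into `bins` equal intervals and count the points per cell."""
    counts, members = Counter(), {}
    for p in points:
        cell = tuple(
            min(int((x - l) / (h - l) * bins), bins - 1) if h > l else 0
            for x, l, h in zip(p, lo, hi)
        )
        counts[cell] += 1
        members.setdefault(cell, []).append(p)
    return counts, members


def local_maxima(counts, min_count):
    """Cells with at least `min_count` points and strictly more than every adjacent cell.

    Both the adjacency rule and the `min_count` noise filter are assumptions.
    """
    if not counts:
        return []
    dim = len(next(iter(counts)))
    # All 3**dim - 1 neighbour offsets; practical only for low-dimensional data.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    peaks = []
    for cell, c in counts.items():
        neighbours = (tuple(a + b for a, b in zip(cell, o)) for o in offsets)
        if c >= min_count and all(c > counts.get(n, 0) for n in neighbours):
            peaks.append(cell)
    return peaks


def estimate_k_and_centers(points, bin_range=range(2, 21), min_count=5, stable_runs=3):
    """Refine the grid until the peak count repeats `stable_runs` times in a row.

    Returns (k, centers), or (None, []) if the count never stabilizes.
    """
    dim = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]
    recent = []
    for bins in bin_range:
        counts, members = grid_density(points, bins, lo, hi)
        peaks = local_maxima(counts, min_count)
        recent = (recent + [len(peaks)])[-stable_runs:]
        if len(recent) == stable_runs and len(set(recent)) == 1 and peaks:
            # Assumed rule: the initial center is the mean of the peak cell's points.
            centers = [
                tuple(sum(axis) / len(members[cell]) for axis in zip(*members[cell]))
                for cell in peaks
            ]
            return len(peaks), centers
    return None, []
```

Calling `estimate_k_and_centers(points)` on a list of equal-length numeric tuples returns a candidate cluster count together with one seed point per peak cell, which could then be passed to any k-means implementation as its initial centers. The strict inequality means that two adjacent cells with equal counts produce no peak, so the result is sensitive to `min_count` and to the range of grid resolutions tried.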

Last Update: 2017-08-22