[1] DENG Xiang, YU Lu, XIE Jun, et al. Deep Clustering Method Based on Reconstruction Error[J]. Computer Technology and Development, 2022, 32(11): 30-36. [doi:10.3969/j.issn.1673-629X.2022.11.005]

Deep Clustering Method Based on Reconstruction Error

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
32
Issue:
2022, No. 11
Pages:
30-36
Column:
Big Data and Cloud Computing
Publication date:
2022-11-10

Article Information

Title:
Deep Clustering Method Based on Reconstruction Error
Article ID:
1673-629X(2022)11-0030-07
Author(s):
DENG Xiang (1), YU Lu (1), XIE Jun (1), LYU Hao-yuan (1), YAO Chang-hua (2)
1. Army Engineering University of PLA, Nanjing 210001, Jiangsu, China;
2. Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China
Keywords:
clustering; deep clustering; deep learning; autoencoder; pattern recognition
CLC number:
TP181
DOI:
10.3969/j.issn.1673-629X.2022.11.005
Abstract:
Clustering is one of the core tasks of machine learning. It is usually performed without labels and partitions data by discovering their latent structure. In recent years, data have become increasingly complex, and the latent space of the data contains various redundant and complex spatial structures, which makes it difficult for traditional clustering algorithms to separate the data of different clusters. Deep learning has strong feature representation and nonlinear approximation capabilities and has also shown advantages in unsupervised clustering; clustering models based on deep learning have effectively improved the clustering results on many kinds of complex data. This paper proposes a new end-to-end deep clustering model. Under the autoencoder framework, the model constructs multiple distinct clustering subspaces and reconstructs the original samples from the low-dimensional features of the high-dimensional samples in these subspaces. It also adds a network that predicts the cluster of each sample and uses the predicted probability vector to compute a weighted fusion of the samples decoded for the different clusters. By minimizing the reconstruction error between the fused samples and the original samples while constraining the subspaces, the model clusters the high-dimensional samples. The model accounts for both the subspace structure of the clusters and the reconstruction error between different clusters, and achieves good clustering results on standard datasets.
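The forward computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain linear maps stand in for the encoder, decoder, and cluster prediction networks, the weights are random rather than trained, and the subspace constraint is omitted because the abstract does not specify its form. All names (`fused_reconstruction`, `encoders`, `decoders`, `predictor`) and the dimensions are hypothetical.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights (assumption: stands in for a learned network)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def fused_reconstruction(x, encoders, decoders, predictor):
    """Return (cluster probabilities, fused sample, squared reconstruction error)."""
    # Cluster prediction: here a single linear layer followed by softmax.
    probs = softmax(matvec(predictor, x))
    # One low-dimensional subspace per cluster: encode, then decode.
    decoded = [matvec(dec, matvec(enc, x)) for enc, dec in zip(encoders, decoders)]
    # Weighted fusion of the per-cluster decoded samples.
    fused = [sum(p * d[i] for p, d in zip(probs, decoded)) for i in range(len(x))]
    # Reconstruction error between the fused sample and the original sample.
    error = sum((a - b) ** 2 for a, b in zip(x, fused))
    return probs, fused, error


rng = random.Random(0)
input_dim, latent_dim, n_clusters = 6, 2, 3  # illustrative sizes only
encoders = [random_matrix(latent_dim, input_dim, rng) for _ in range(n_clusters)]
decoders = [random_matrix(input_dim, latent_dim, rng) for _ in range(n_clusters)]
predictor = random_matrix(n_clusters, input_dim, rng)

x = [rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
probs, fused, error = fused_reconstruction(x, encoders, decoders, predictor)
# The cluster assignment is the most probable entry of the prediction vector.
cluster = max(range(n_clusters), key=lambda k: probs[k])
```

In a trained model of this kind, `error` (plus whatever subspace constraint is used) would be minimized over all samples by gradient descent, and `cluster` would be read off as the predicted cluster of `x`.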

Last Update: 2022-11-10