[1] HUANG Cheng-ning, LI Li, JIANG Li-li, et al. Research on Data Stream Clustering Algorithm Based on Interactive Basis Function[J]. Computer Technology and Development, 2024, 34(03): 28-34. [doi:10.3969/j.issn.1673-629X.2024.03.005]

Research on Data Stream Clustering Algorithm Based on Interactive Basis Function

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
34
Issue:
2024, No. 03
Pages:
28-34
Column:
Big Data and Cloud Computing
Publication date:
2024-03-10

Article Info

Title:
Research on Data Stream Clustering Algorithm Based on Interactive Basis Function
Article ID:
1673-629X(2024)03-0028-07
Author(s):
HUANG Cheng-ning1, LI Li1, JIANG Li-li1, XU Ping-ping2
1. Pujiang College of Nanjing University of Technology, Nanjing 211222, Jiangsu, China;
2. School of Information Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China
Keywords:
cluster; data stream; data stream clustering; interactive basis function; fuzzy adaptive resonance theory
CLC number:
TP311.13
DOI:
10.3969/j.issn.1673-629X.2024.03.005
Abstract:
Clustering is an effective tool for data mining, and data stream clustering has become an active research topic. Many data stream clustering algorithms have been proposed, but most use distance as the similarity measure, which makes them sensitive to noise and yields unsatisfactory clustering results. To make data stream clustering more flexible and improve clustering quality, this paper introduces fractional-order interactive basis functions (IBFs) into data stream clustering and extends them with the fuzzy ART algorithm to obtain a flexible decision boundary strategy, yielding a new data stream clustering algorithm, IBFs_ART. For each arriving data point, the algorithm first expands the features with a precomputed function based on the correlations between features and applies a fractional-order transformation to the original features; it then clusters the data stream on the basis of the interactive basis functions. Interactive basis functions generate flexible decision boundaries without requiring specific software, and because the precomputed function can be implemented within any algorithm, it can be used in any extension of a data stream clustering algorithm. Experiments show that IBFs generate flexible decision boundaries for finding the optimal clusters at a low computational cost, achieve higher clustering quality and purity under the same vigilance parameter, and give higher clustering accuracy and symmetric measure, and a lower error rate, than traditional clustering algorithms.
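The pipeline outlined in the abstract (expand each arriving point, apply a fractional-order transform, then cluster with fuzzy ART under a vigilance parameter) can be illustrated with a minimal sketch. This is not the paper's IBFs_ART implementation: the abstract does not give the form of the interactive basis functions, so `expand_features`, its `order` parameter, and the pairwise-product interaction terms are hypothetical stand-ins, and the `FuzzyART` class is a plain textbook fuzzy ART with complement coding. Inputs are assumed to be scaled to [0, 1].

```python
from itertools import combinations

import numpy as np


def expand_features(a, order=0.5):
    """Hypothetical stand-in for an interactive basis expansion.

    Raises each original feature to a fractional power and appends one
    interaction term per feature pair. Assumes values in [0, 1].
    """
    a = np.asarray(a, dtype=float)
    fractional = a ** order
    pairs = [(a[i] * a[j]) ** order for i, j in combinations(range(a.size), 2)]
    return np.concatenate([fractional, np.array(pairs, dtype=float)])


class FuzzyART:
    """Textbook fuzzy ART, processing one point at a time."""

    def __init__(self, vigilance=0.75, alpha=1e-3, beta=1.0):
        self.vigilance = vigilance  # minimum match ratio to join a cluster
        self.alpha = alpha          # choice parameter
        self.beta = beta            # learning rate (1.0 = fast learning)
        self.weights = []           # one weight vector per cluster

    def partial_fit(self, a):
        x = np.concatenate([a, 1.0 - a])  # complement coding
        # Rank existing clusters by the choice function, best first.
        scores = [np.minimum(x, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        for j in np.argsort(scores)[::-1]:
            w = self.weights[j]
            match = np.minimum(x, w).sum() / x.sum()
            if match >= self.vigilance:  # vigilance test passed
                self.weights[j] = (self.beta * np.minimum(x, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        # No cluster passed the vigilance test: open a new one.
        self.weights.append(x.copy())
        return len(self.weights) - 1


# Toy stream: two points near the origin, two near (0.9, 0.9).
stream = np.array([[0.10, 0.10], [0.12, 0.09], [0.90, 0.85], [0.88, 0.90]])
model = FuzzyART(vigilance=0.75)
labels = [model.partial_fit(expand_features(p)) for p in stream]
print(labels)  # [0, 0, 1, 1]
```

Because the expansion runs as a preprocessing step on each point, it is independent of the clustering routine, which is the property the abstract attributes to the precomputed function. Raising `vigilance` toward 1 makes the match test stricter and tends to produce more, tighter clusters.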

Last Update: 2024-03-10