[1] SUN Qi-hang, YANG He-biao. Optimizing of Sequence Clustering Algorithm Based on Edit Distance[J]. Computer Technology and Development, 2018, 28(03): 109-113. [doi:10.3969/j.issn.1673-629X.2018.03.023]
Optimizing of Sequence Clustering Algorithm Based on Edit Distance
Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
28
Issue:
2018, No. 03
Pages:
109-113
Section:
Intelligence, Algorithms, and Systems Engineering
Publication date:
2018-03-10

Article Info

Title:
Optimizing of Sequence Clustering Algorithm Based on Edit Distance
Article ID:
1673-629X(2018)03-0109-05
Author(s):
SUN Qi-hang (孙启航), YANG He-biao (杨鹤标)
School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, China
Keywords:
sequence clustering; edit distance; bisecting k-means; sequence similarity
CLC number:
TP311
DOI:
10.3969/j.issn.1673-629X.2018.03.023
Document code:
A
Abstract:
Many existing sequence clustering algorithms assume that local features can represent an entire sequence, and in practice they do not distinguish local similarity from global similarity. This suits sequences that contain subpatterns, such as gene sequences and protein sequences. For sequences without subpatterns, such as clinical behavior sequences or customer purchase sequences, clustering based on a global similarity measure is more appropriate. To cluster such sequences, we use edit distance as the sequence similarity measure and, building on the bisecting k-means algorithm, propose PSClu, a sequence clustering algorithm that prunes edit distance computations using upper and lower bounds on the edit distance and prefix subsequences. Experiments show that PSClu effectively reduces the number of direct edit distance computations and achieves good clustering efficiency and quality.
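The abstract does not spell out which bounds or which prefix rule PSClu uses, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the standard Levenshtein distance, the length-based bounds |len(a) - len(b)| <= d(a, b) <= max(len(a), len(b)), and an early exit once the partial result for a prefix already exceeds a threshold. The names `distance_bounds`, `edit_distance`, `nearest_center`, and the `threshold` parameter are hypothetical and chosen for illustration.

```python
def distance_bounds(a, b):
    """Cheap lower and upper bounds on the edit distance (assumed bounds)."""
    return abs(len(a) - len(b)), max(len(a), len(b))


def edit_distance(a, b, threshold=None):
    """Levenshtein distance between sequences a and b.

    If threshold is given, return None as soon as the distance is known
    to exceed it, so the full table is not computed.
    """
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter one inside
    lower, _ = distance_bounds(a, b)
    if threshold is not None and lower > threshold:
        return None  # filtered by the lower bound, no table needed
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        # Row minima never decrease, so the minimum over this prefix of a
        # is a lower bound on the final distance.
        if threshold is not None and min(cur) > threshold:
            return None
        prev = cur
    d = prev[-1]
    return d if threshold is None or d <= threshold else None


def nearest_center(seq, centers):
    """Index and distance of the closest center, pruning with the best so far."""
    best_idx, best_d = None, None
    for idx, center in enumerate(centers):
        limit = None if best_d is None else best_d - 1  # must be strictly better
        d = edit_distance(seq, center, threshold=limit)
        if d is not None:
            best_idx, best_d = idx, d
    return best_idx, best_d
```

In a bisecting k-means loop, each split assigns every sequence to one of two centers, so a call such as `nearest_center(seq, [center_a, center_b])` would be the step where this kind of filtering saves work. Sequences can be strings or lists of event codes, since only length, iteration, and equality are used.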

Last Update: 2018-04-26