

Time Series Clustering Based on LLE and Gaussian Mixture Model (基于 LLE 和高斯混合模型的时间序列聚类)


Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 32
Issue:: 2022, No. 08

Pages:: 33-41

Section:: Big Data Analysis and Mining

Publication date:: 2022-08-10

Article Info

Title:: Time Series Clustering Based on LLE and Gaussian Mixture Model

Article number:: 1673-629X(2022)08-0033-09

Author(s):: YANG Qiu-ying (杨秋颖); WENG Xiao-qing (翁小清); School of Information Technology, Hebei University of Economics & Business, Shijiazhuang 050061, China

Keywords:: locally linear embedding; Gaussian mixture model; manifold learning; time series clustering; deep learning

Classification number:: TP311

DOI:: 10.3969/j.issn.1673-629X.2022.08.006

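Abstract (translated from the Chinese):: Cluster analysis is a common data mining method, and time series data mining can turn massive amounts of temporal information into organized knowledge. Because time series are high-dimensional and nonlinear, most clustering algorithms cannot be applied directly to raw time series data with satisfactory results. It is therefore important to study how to preserve as many intrinsic features of the data as possible during dimensionality reduction and to identify the genuinely interesting patterns that represent knowledge. Most existing time series clustering algorithms do not consider the local structure of the data set, although that local structure has a considerable effect on clustering performance. This paper proposes a time series clustering algorithm based on Locally Linear Embedding (LLE) and the Gaussian Mixture Model (GMM). First, to preserve the local structure of the data set, LLE represents each high-dimensional time series sample as a linear combination of its k nearest neighbors and reconstructs it in a low-dimensional space, achieving dimensionality reduction while maintaining the local geometric structure of the data set. GMM is then used to perform cluster analysis from the perspective of probability distributions. Compared with existing methods, the proposed method achieves better results on univariate time series clustering.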

Abstract:: Cluster analysis is a common data mining method, and time series data mining can transform massive time series information into organized knowledge. Because time series are high-dimensional and nonlinear, most clustering algorithms cannot be applied directly to the original time series data with satisfactory results. It is therefore important to study how to retain as many inherent features of the data as possible while reducing the dimensionality, and to identify the genuinely interesting patterns that represent knowledge. Most existing nonlinear dimensionality reduction methods reduce the dimension from the perspective of preserving global features and ignore the local linear features of the data set. A time series clustering algorithm based on locally linear embedding (LLE) and the Gaussian mixture model (GMM) is proposed. First, to preserve local features, LLE represents each high-dimensional time series sample as a linear combination of its k nearest neighbors and reconstructs it in a low-dimensional space, so that dimensionality reduction is achieved while the local geometric structure of the data is preserved. Then, GMM performs cluster analysis from the perspective of probability distributions. Compared with existing methods, the proposed algorithm obtains better clustering results on univariate time series.
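
The two-stage pipeline described in the abstract (LLE for dimensionality reduction, then GMM for clustering) can be sketched as follows. This is a minimal illustration built on scikit-learn, not the authors' implementation: the function names `lle_gmm_cluster` and `make_toy_series`, the default values for `n_neighbors` and `n_components`, and the synthetic sine/ramp data are all assumptions made for the example. It also assumes that every series has the same length.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.mixture import GaussianMixture


def lle_gmm_cluster(series, n_clusters, n_neighbors=10, n_components=2, seed=0):
    """Cluster equal-length univariate time series with LLE followed by GMM.

    series: array-like of shape (n_series, series_length), one series per row.
    Returns one integer cluster label per series.
    """
    X = np.asarray(series, dtype=float)

    # Step 1: LLE expresses each series as a weighted combination of its
    # k nearest neighbours, then finds low-dimensional coordinates that
    # preserve those reconstruction weights.
    lle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_components,
        random_state=seed,
    )
    embedded = lle.fit_transform(X)

    # Step 2: fit a Gaussian mixture in the embedded space and assign each
    # series to the component with the highest posterior probability.
    gmm = GaussianMixture(n_components=n_clusters, n_init=5, random_state=seed)
    return gmm.fit_predict(embedded)


def make_toy_series(n_per_class=30, length=64, seed=0):
    """Hypothetical toy data: noisy sine waves and noisy linear ramps."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, length)
    sines = np.sin(2 * np.pi * 2 * t) + 0.1 * rng.standard_normal((n_per_class, length))
    ramps = (2 * t - 1) + 0.1 * rng.standard_normal((n_per_class, length))
    return np.vstack([sines, ramps])


if __name__ == "__main__":
    X = make_toy_series()
    labels = lle_gmm_cluster(X, n_clusters=2)
    print(labels)
```

In this sketch the number of neighbors, the embedding dimension and the number of mixture components are chosen by hand. In practice they would need tuning per data set, since the LLE embedding is sensitive to the neighborhood size.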

