[1]任敬佩,邢敬创,白晓伟.基于云计算的电力大数据分析算法研究[J].计算机技术与发展,2021,31(增刊):47-51.[doi:10.3969/j.issn.1673-629X.2021.S.009]
REN Jing-pei, XING Jing-chuang, BAI Xiao-wei. Research on Algorithm of Big Data Analysis of Power Based on Cloud Computing[J]. Computer Technology and Development, 2021, 31(Suppl.): 47-51. [doi:10.3969/j.issn.1673-629X.2021.S.009]
基于云计算的电力大数据分析算法研究
Computer Technology and Development (《计算机技术与发展》) [ISSN: 1006-6977 / CN: 61-1281/TN]
- Volume: 31
- Issue: 2021 Supplement
- Pages: 47-51
- Column: Big Data Analysis and Mining
- Publication date: 2021-12-31
Article Information
- Title: Research on Algorithm of Big Data Analysis of Power Based on Cloud Computing
- Article ID: 1673-629X(2021)S0047-05
- Authors (Chinese): 任敬佩; 邢敬创; 白晓伟
- Affiliation (Chinese): 西安思安云创科技有限公司, Xi’an, Shaanxi 710000
- Authors (English): REN Jing-pei; XING Jing-chuang; BAI Xiao-wei
- Affiliation (English): Xi’an Sian Yunchuang Technology Co., Ltd., Xi’an 710000, China
- Keywords (Chinese): 距离三角不等式; 类轮廓; 轮廓系数; 时效性; 正确率; 噪声点
- Keywords (English): distance triangle inequality; class profile; silhouette coefficient; time-efficient; correctness rate; noise data
- Classification number: TP301.6
- DOI: 10.3969/j.issn.1673-629X.2021.S.009
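- Abstract (translated from the Chinese version): To address the low detection accuracy and high complexity that make traditional clustering algorithms unsuitable for outlier detection in electric power big data, a class-profile clustering algorithm based on the distance triangle inequality is proposed for processing abnormal power data on a cloud computing platform. First, the source data are reduced in dimension and cleaned according to the calculation and analysis requirements for three-phase unbalance, power, and related quantities. Next, the class-profile clustering algorithm based on the distance triangle inequality is used to compute and identify the processed power operation data. Finally, the silhouette coefficient, cluster density, timeliness, and accuracy are used as evaluation metrics to assess the algorithm, which quickly detects isolated points and noise data and reduces I/O and network transmission overhead. The algorithm can effectively handle clusters of arbitrary shape and, to some extent, prevents linear or serpentine clusters from forming. The optimal number of clusters determined in this way is used to process the enterprise's power quality curves, and data points that do not meet the requirements are regarded as power data outliers. Cluster analysis of one enterprise's three-phase current, three-phase voltage, and power data verifies the feasibility and effectiveness of the algorithm.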
- Abstract (English version): To address the low accuracy and high complexity that make traditional clustering algorithms unsuitable for outlier detection in power big data, a class-profile clustering algorithm based on the distance triangle inequality is proposed. First, dimension reduction and data cleaning are carried out according to the calculation and analysis requirements of three-phase unbalance and power. Second, the class-profile clustering algorithm based on the distance triangle inequality computes and identifies the processed power operating data. Finally, the silhouette coefficient, cluster density, time efficiency, and correctness rate are used to evaluate the algorithm, which quickly detects outliers and noise data and reduces computational redundancy and running time. The proposed algorithm can effectively process clusters of any shape and, to a certain extent, prevents linear or serpentine clusters from appearing. The optimal number of clusters it determines is used to cluster the enterprise's power quality curves, and data points that do not meet the requirements are considered abnormal values of the power data. The algorithm is applied to the three-phase current, three-phase voltage, and power data of a certain company, and the cluster analysis verifies its feasibility and effectiveness.
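The abstract names two building blocks without giving details: pruning distance computations with the triangle inequality, and scoring a clustering with the silhouette coefficient. As a rough illustration only, the Python sketch below shows one common way each idea can be realized. It is not the authors' class-profile algorithm, which this page does not describe; the function names `assign_with_pruning` and `silhouette_score`, the nearest-center assignment step, and the toy data are all assumptions made for this example.

```python
import math


def assign_with_pruning(points, centers):
    """Assign each point to its nearest center (hypothetical helper).

    Pruning rule from the triangle inequality: if
    d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best),
    so the distance from x to c_j never needs to be computed.
    """
    # Center-to-center distances are computed once and reused for every point.
    center_dist = [[math.dist(a, b) for b in centers] for a in centers]
    labels, skipped = [], 0
    for x in points:
        best, best_dist = 0, math.dist(x, centers[0])
        for j in range(1, len(centers)):
            if center_dist[best][j] >= 2.0 * best_dist:
                skipped += 1  # pruned: c_j cannot be strictly closer
                continue
            d = math.dist(x, centers[j])
            if d < best_dist:
                best, best_dist = j, d
        labels.append(best)
    return labels, skipped


def silhouette_score(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters
    and that the points are not all identical."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is included in the sum).
        a = sum(math.dist(points[i], points[k]) for k in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[k]) for k in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (made up for illustration): two tight, well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (10.0, 10.1), (10.2, 9.9), (9.9, 10.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]
labels, skipped = assign_with_pruning(points, centers)
score = silhouette_score(points, labels)
print(labels, skipped, round(score, 3))
```

In a workflow like the one the abstract outlines, a score of this kind could be computed for several candidate cluster counts and the count with the highest mean silhouette kept. Points with low or negative individual silhouette values would then be natural candidates for closer inspection as outliers.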
Last Update: 2021-09-10